Discussion on Crawlomatic Multisite Scraper Post Generator Plugin for WordPress


496 comments found.

Hello, how can I exclude/strip this from the content?

<a href="https://twitter.com/xxxxxx" title="Visit xxxxxx page" target="_blank"><img decoding="async" src="https://xxxxxxx.com/wp-content/uploads/2023/11/xxxxLogo.png" style="max-width:32px;height:auto;border:0px;margin:0px;padding:0px;" /></a>
And how can I include this in the content?
<div class="article-source">Source<!-- -->: <!-- -->XXXXXX</div>

Hello,

As the link you want to strip does not have an ID or class, it can only be stripped using Regex. Add a Regex that matches this HTML code to the ‘Run Regex On Content’ settings field, in rule settings.
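A pattern for this can be tried out before pasting it into the settings field. The sketch below uses Python only as a test bench; the plugin itself runs on PHP, so the field most likely expects PCRE syntax, and whether it needs surrounding delimiters is an assumption to check against the field's help text. The sample content is hypothetical and only mirrors the anchor quoted in the question.

```python
import re

# Hypothetical scraped content; the anchor mirrors the one quoted in the question.
content = (
    '<p>Article text.</p>'
    '<a href="https://twitter.com/xxxxxx" title="Visit xxxxxx page" target="_blank">'
    '<img decoding="async" src="https://xxxxxxx.com/wp-content/uploads/2023/11/xxxxLogo.png" '
    'style="max-width:32px;height:auto;border:0px;margin:0px;padding:0px;" /></a>'
    '<p>More text.</p>'
)

# Match an <a> that links to twitter.com and wraps a single <img>.
# [^>]* keeps each piece of the match inside one tag, so other links are left alone.
pattern = re.compile(
    r'<a\s[^>]*href="https://twitter\.com/[^"]*"[^>]*>\s*<img\s[^>]*>\s*</a>',
    re.IGNORECASE,
)

stripped = pattern.sub("", content)
print(stripped)
```

The pattern is deliberately narrow (Twitter link wrapping one image), so ordinary links in the article body are not removed.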

To add additional data to posts, add the text to the ‘Generated Post Content’ settings field, in rule settings.

Regards, Szabi – CodeRevolution

Hello, can you help me fix an issue?

Today the scrape was successful, but when the scraped article was shared on Facebook it did not display the featured image and instead took an image from the content. og:image should take the image from the featured image, not from the content.

The Featured Image in the post is taken from the image we want; only og:image is problematic. This didn’t happen before.

Hello,

Please send the URL of the scraped site to my email, kisded@yahoo.com, and I will check on this.

Regards, Szabi – CodeRevolution

Hello, how do I make the scrape result have the same date as the source? Currently the publication date of the scrape results is not the same as the source publication date.

As an example:

The source publication date is January 31, 2024, but the publication date of the scrape results on my website is February 01, 2024.

So now I have to manually change the publish date to January 31, 2024.

Please help

Hello,

Thank you for your purchase.

For this, you will need to point the scraper to the publish date of the original article, available on the source article page.

You can use the following settings fields in rule settings of Crawlomatic:

Date Query Type

Date Query String

Regards, Szabi – CodeRevolution

Hi, does this plugin work with Bulgarian (Cyrillic) URLs, author names, titles and so on?

Hello,

Thank you for contacting me.

Yes, the plugin will also be able to scrape content in Bulgarian (Cyrillic).

Regards, Szabi – CodeRevolution

Hi, thank you for the plugin which has made my work easier.

I have several questions regarding the issue:

1. I set the schedule to 24 hours and used Run This Rule Now on January 30, 2024 at 06:00, but in the draft on January 31, 2024 I see Last Modified 01/31/2024 at 7:02 am.

Why is there a 1 hour time difference? How do we set the post time to match what we want if we later enable autopublish? For example, I want scrape results to auto publish at 06:00.

2. I created 3 Scraper Start (Seed) URLs on January 30, 2024, all with a schedule of 24 hours:

ID: 1 (Max Posts: 12)
ID: 2 (Max Posts: 2)
ID: 3 (Max Posts: 5)

I used Run This Rule Now at 06:00.

On January 31, 2024:

ID: 1 (Max Posts: 12) Unsuccessful
ID: 2 (Max Posts: 2) Successfully drafted at 07:02
ID: 3 (Max Posts: 5) Successfully drafted at 07:02

Why doesn’t ID: 1 work, even though I set it up the same as ID: 2?

Hello,

Thank you for your purchase.

From the sound of this issue, your site might have problems with the wp_cron functionality of WordPress.

To fix this, you will need to replace wp_cron with a server side cron. Please check this page for details on how to do this: https://themeisle.com/blog/disable-wp-cron/

If you don’t know how to make these changes, please contact your hosting provider’s support and ask them to fix wp_cron for your server.

After this is fixed, the plugin will be able to run automatically without issues. It uses the wp_cron system for scheduling, and that system will be replaced with the server-side cron, which is much more reliable.

Regards, Szabi – CodeRevolution

Hello, thanks for the plugin, it works. But I have a few questions about how to strip a few things from the source:

1. <div class="sharethis-inline-share-buttons" style="margin-top: 0px; margin-bottom: 0px;" /> 

2. <blockquote class="wp-embedded-content" data-secret="dSaYkJzbZt">

3. <iframe class="wp-embedded-content" style="position: absolute; clip: rect(1px, 1px, 1px, 1px);" 

4. <div class="mh-social-bottom">
<div class="mh-share-buttons clearfix" />
</div>
,</blockquote>

Hello,

Thank you for your purchase.

Please try adding the below to the ‘Strip HTML Elements by Class’ settings field in rule settings:

Strip HTML Elements by Class

sharethis-inline-share-buttons,wp-embedded-content,mh-social-bottom
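Three class names are enough for the four fragments in the question, because the blockquote and the iframe share one class and the share buttons sit inside the mh-social-bottom wrapper. The sketch below only checks that coverage; it assumes (not confirmed here) that stripping an element by class also removes everything nested inside it. The fragments are trimmed copies of the ones quoted above, and `outer_classes` is a hypothetical helper.

```python
import re

# The four fragments quoted in the question (attributes trimmed for brevity).
fragments = [
    '<div class="sharethis-inline-share-buttons" style="margin-top: 0px;" />',
    '<blockquote class="wp-embedded-content" data-secret="dSaYkJzbZt">',
    '<iframe class="wp-embedded-content" style="position: absolute;"',
    '<div class="mh-social-bottom"><div class="mh-share-buttons clearfix" /></div>',
]

# The value suggested for the settings field, split on commas.
strip_classes = (
    "sharethis-inline-share-buttons,wp-embedded-content,mh-social-bottom".split(",")
)

def outer_classes(fragment):
    """Return the class names on the first (outermost) tag of a fragment."""
    match = re.search(r'class="([^"]*)"', fragment)
    return match.group(1).split() if match else []

# A fragment is covered when its outermost tag carries a listed class.
covered = [any(c in strip_classes for c in outer_classes(f)) for f in fragments]
print(covered)
```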

Regards, Szabi – CodeRevolution

Thanks, it worked

I am glad to help.

I tried with this page: https://www.univision.com/famosos/ and it returned no posts. Which page would work?

Hello,

I checked, and the best way to scrape this site is to scrape its sitemaps. Please check them below: https://www.univision.com/sitemap_famosos_1.xml

And: https://www.univision.com/sitemap_famosos_2.xml

Tutorial video on sitemap scraping: https://www.youtube.com/watch?v=xi1S1093ubo

I hope this info helps.

Regards, Szabi – CodeRevolution.

Hello, I wanted to know if your plugin can extract, like WP Automatic does, posts from a sitemap from a specific date, for example, everything that was published after December 20, 2023, thank you

Hello,

Thank you for contacting me.

Extracting posts from sitemaps is possible, please check this video for details: https://www.youtube.com/watch?v=xi1S1093ubo

Importing posts only after a specific date is not possible, but the idea is noted and I will consider adding it in upcoming updates.

Regards, Szabi – CodeRevolution

Can you give me the main settings for how to do it, as text or a photo of the settings? Thanks

Hello,

Thank you for contacting me.

Please provide more details on what exactly you want to achieve using the plugin, so I can get a better understanding of your question.

Regards, Szabi – CodeRevolution

Hello, can you give us a video on how to use it to get hotels for a city and country from the website booking.com? Thanks

Sure, noted.

Thanks, I wish you a Merry Xmas and a Happy New Year.

Same to you too!

Hi,

I’m using SpinnerChief as my spinner. It failed to do an article spin.

The error message in the log is: “SpinnerChief” failed to spin article – titleseparator not found

I tried to find the setting for title separator. But I could only find category separator and tag separator.

Where do I find this setting?

Hello,

I updated the plugin on your site, now spinning will work, please check.

Let me know if it worked for you.

Regards.

Hi,

It’s still not working. The scraped content is still the same (not spun).

But this time, there are no error messages in the log.

Hello,

This time, the issue seems to be on SpinnerChief’s side, as they are returning the same text that was sent to their API.

Can you contact their support and ask about this issue? I will email you details about the API call made by the plugin; please forward those details to them.

Regards.

Hello,

I would like to know how to configure Crawlomatic to import posts without including any images.

My goal is to import only the text of the articles, without the featured image, internal images, or hotlinks to images from the original post.

Thank you in advance for your help.

Hello,

Thank you for contacting me.

Yes, this is possible. Please check the ‘Strip Images From Post Content’ checkbox in rule settings -> save settings -> import new posts.

Regards, Szabi – CodeRevolution

Hi,

Since Wordpressomatic is limited to sites that run on WordPress, I have tested Crawlomatic. However, after entering 10 posts per 24 hours and executing “Run This Rule Now”, the action only takes effect on the home page: it does not bring 10 posts to the testing site, only the home page or the specific URL that was entered.

If I want to bring more posts to the testing site, I must include each URL one by one, which would be almost the same as doing it manually (copy and paste).

What should I do so that it really fulfills the function of bringing 10 or a certain number of posts every hour or every 24 hours?

Hello,

Thank you for contacting me.

The Crawlomatic plugin needs to be set up to recognize which posts you want to scrape and import. Please check these tutorial videos for details: https://www.youtube.com/watch?v=F6vhRJgCR_M&list=PLEiGTaa0iBIgcqNzVBaoTCS4ws47vNMuQ

Regards, Szabi – CodeRevolution

Hi Szabi, me again ;) I would like to get the Crawlomatic. I’ve played around a bit with the demo, but haven’t found a solution yet. The pages I want to crawl are my own. I would like to rewrite the text afterwards with Aiomatic. Rewriting shouldn’t be a problem, I’ve managed it in the meantime.

But I’m still stuck:

The original post contains a shortcode:

It is important that this button is included on the new website. Unfortunately, I haven’t managed to do this yet. Can I define somewhere that it finds this shortcode and crawls it (and, in the best case, that it is not “destroyed” by Aiomatic)?

Thank you!

Hello,

Thank you for contacting me.

Please send me the URL you want to scrape and highlight the button in it, and I will check on this on my test site.

My email is kisded@yahoo.com

Regards, Szabi – CodeRevolution

Your email address didn’t work.

Interesting. Please try support@coderevolution.ro

Let me know if it worked.

Regards.

Hi Szabi, is it possible to crawl the product SKU?

Hello,

Thank you for your purchase.

Yes, scraping the SKU for products is possible. For this, you have to create a custom shortcode in the plugin that stores the SKU value, and assign it to the _sku custom post field so that WooCommerce recognizes it.

For this, you need to set the following settings (customized for the specific site you want to scrape, which you sent in email – for other URLs, this might change):

Custom Shortcode Creator
sku => regex @@ #"productID":"([^"]*?)"#

Post Custom Fields
_sku => %%sku%%
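To see what the regex in the rule captures, it can be run against a sample of the page source. This is only a sketch: the HTML below is a hypothetical stand-in for the real page, and the leading and trailing # in the rule are treated as pattern delimiters and left out, since Python does not use them.

```python
import re

# Hypothetical page source; real markup will differ per site.
html = (
    '<script type="application/ld+json">'
    '{"name":"Widget","productID":"ABC-123"}'
    '</script>'
)

# Same pattern as in the rule, without the surrounding # delimiters.
match = re.search(r'"productID":"([^"]*?)"', html)

# The first capture group is the value that would end up in %%sku%%.
sku = match.group(1) if match else None
print(sku)
```

If `sku` comes back as `None` for a real page, the site probably stores the SKU under a different key and the pattern needs adjusting.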
 

I will get back to you in email, with more settings for the plugin for the specific URL you want to scrape.

Regards, Szabi – CodeRevolution.

Hi

Is it possible to scrape from multiple websites at once from a file/list?

I just want to add 100 websites and scrape the logo, website URL (minus the extension), and meta description.

Thanks

Rob

Hello,

Thank you for contacting me.

The plugin can scrape a list of websites. However, the scraped content can only be published as WordPress posts; it cannot be stored in lists. Also, the logo cannot be scraped.

To get exactly what you need, a custom plugin needs to be implemented.

Regards, Szabi – CodeRevolution

Thanks. I don’t want to create a list. I have a list of websites in a spreadsheet. Can I scrape in bulk? E.g. I need to scrape information from the home page of each website, but need to be able to just paste the list in. Can this be done? Thanks

Yes, adding a list of sites is possible. You can add the links to a txt file and upload it to your server, and the plugin will be able to read the links from it. However, auto-detecting the images and description of sites can be a bit tricky.

Video for adding list of links: https://www.youtube.com/watch?v=Gzwle15-PN4

Regards.

Hi, just a pre-sale question: does this plugin have an AI image feature? Or does it have the option to use Aiomatic’s AI image feature, since I have Aiomatic?

Hello,

Thank you for contacting me.

Yes, you can combine Aiomatic with Crawlomatic, as Aiomatic will be able to generate AI images for scraped posts. This tutorial will help: https://www.youtube.com/watch?v=s_jbc5rnG1E

Regards, Szabi – CodeRevolution.

How can I record the source of articles posted on Google News? For example, for an article taken from “https://www.cbssports.com/nfl/news/nfl-week-1-grades-bengals-get-an-f-for-blowout-loss-to-browns-49ers- get-an-a-for-destroying-steelers/”, the following would be added at the end of the article: Source: cbssports.com

Hello,

Thank you for your explanation. This is currently not possible. However, if you want this functionality, I can make a paid custom update for the plugin and add this feature to it.

For details, please contact me at my email kisded@yahoo.com

Regards.

Hello, we are very interested in your plugin. We would like to know if we can get technical support from you (for a fee, naturally) to help us set up your plugin on our site. We have specific sites that we are interested in, and we would like to know if you can make us a ready-made setup (sample) so that we can download articles from those sites. Please let us know if we can somehow send you a personal message with the details. Thank you!

Hello,

Thank you for contacting me.

Yes, this is possible, you can contact me for details at my email address kisded@yahoo.com

Regards, Szabi – CodeRevolution.

Hi,

Is it possible to use a Packetstream proxy link?

Hello,

I am not sure if this is possible, as I haven’t tried it. I suggest you ask the proxy provider’s support if their proxy link can be used in PHP Curl. If the response is positive, the proxy will work in the plugin.

Regards, Szabi – CodeRevolution
