Discussion on WP Content Crawler - Get content from almost any site, automatically!

turgutsaricam

238 comments found.

dreamgle

dreamgle Purchased

Dear

How can I remove links from the post content (without deleting the text itself)? Also, how can I detect the date if it is formatted as dd/mm/Y?

Thanks

As I said, you cannot use the same license on more than one domain and the notification is shown indicating that. If you do not have the plugin activated on another domain, please send your license key through my profile page so that I can reset your license’s domain settings.

I checked the source code of the plugin. You should provide “[wcc-main-title]” as the post title template. More precisely, you should not leave the title template empty. If the title template is empty, the plugin does not replace the shortcodes, since it assumes there is no shortcode in the post title. So, writing “[wcc-main-title]” should solve your issue. I noted this bug and I will fix it in the next update.

dreamgle

dreamgle Purchased

Thank you very much. May I ask how I can extract the background-image URL from this HTML code?
<a class="pull-left thumb" href="/abc.html">
    <img src="/Content/images/space-2.gif" alt="" style="background-image: url('http://static.domain.com/uploads/Images/IMG_3210.JPG.280.0.cache" />
</a>

I need to get the image URL to set it as the featured image: http://static.domain.com/uploads/Images/IMG_3210.JPG

The selector is: .pull-left thumb img
But the result it returns is: /Content/images/space-2.gif

Please help me, thank you.

First, you need to remove the unnecessary part from the “style” attribute’s value. You can do this by using the “find and replace in element attributes” option, for example:

Regex: checked
Find: ^.*?url\('(.*?)\.[0-9].*?$
Replace: $1

Test value: background-image: url('http://static.domain.com/uploads/Images/IMG_3210.JPG.280.0.cache" 
After replacement: http://static.domain.com/uploads/Images/IMG_3210.JPG

Then, you need to swap the values of the “src” and “style” attributes. You can do this using the “exchange element attributes” option.

After these steps, your image selector should work just fine.
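
The two steps can be tried outside the plugin with a minimal Python sketch. It only mimics what the options are described as doing and is not the plugin's own code. The function names `clean_style` and `exchange_attributes` are made up for illustration, and the attribute values are copied from the <img> element in the question. Note that Python writes the replacement `$1` as `\1`.

```python
import re

# Find/replace values from the reply; Python spells the backreference $1 as \1.
FIND = r"^.*?url\('(.*?)\.[0-9].*?$"
REPLACE = r"\1"


def clean_style(style_value: str) -> str:
    """Reduce a background-image style value to the bare image URL."""
    return re.sub(FIND, REPLACE, style_value)


def exchange_attributes(attrs: dict, first: str, second: str) -> dict:
    """Return a copy of attrs with the values of two attributes swapped."""
    swapped = dict(attrs)
    swapped[first], swapped[second] = attrs[second], attrs[first]
    return swapped


# Attribute values taken from the <img> element in the question.
img = {
    "src": "/Content/images/space-2.gif",
    "style": "background-image: url('http://static.domain.com/uploads/Images/IMG_3210.JPG.280.0.cache",
}

# Step 1: find and replace inside the attribute value.
img["style"] = clean_style(img["style"])

# Step 2: exchange the two attributes, so "src" now holds the real image URL.
img = exchange_attributes(img, "src", "style")
print(img["src"])
```

The capture group is lazy, so it stops at the first dot that is followed by a digit (".280"), which is why the ".JPG" extension is kept.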

Hi, if I buy this plugin, can I use it on multiple domains, or do I need a license for each domain? Thanks.

Hi,

You need a license for each domain.

Hi, I’ve installed the plugin, but when I click on the “post” tab there is nothing there to fill in. It does not look the same as the demo. What should I do?

Hi,

Make sure your PHP version is at least 5.6 and the mbstring extension is enabled. If these requirements are met, you can try disabling other plugins to make sure they do not interfere with the plugin. If this does not work either, you can enable WP’s debug mode and debug display (see “Debugging in WordPress”) to check whether an error is shown when you browse the site settings page.

Got it. So 7.1 is definitely not working; I tried with 7.0 and it’s all good. Thank you!

I’m glad it is resolved. Thanks for letting me know.

Hello, brother. I used the nulled version for about 3 days to try it out. It is a really nice program, well done. I have already bought it :) Please forgive me for that.

How can I fix character problems like “—”?

Hello,

You can try checking the “use UTF-8” option in the general settings. If that does not work, try replacing the character set in the page’s source code with UTF-8 by using the “find and replace when the page is loaded” option.
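
Character problems of this kind usually appear when bytes are decoded with a different character set than the one they were written in. The Python sketch below illustrates that general effect. It does not show the plugin's code, the sample text is hypothetical, and Windows-1252 is only an assumed example of a wrong character set, since the thread does not say which one the target site declared.

```python
# Hypothetical sample text; "\u2014" is a long dash, "\u00e9" and "\u00fc" are accented letters.
original = "caf\u00e9 \u2014 men\u00fc"

# The bytes as a UTF-8 page would send them.
raw_bytes = original.encode("utf-8")

# Decoding with the wrong character set (assumed here: Windows-1252) garbles the text.
garbled = raw_bytes.decode("cp1252")

# Decoding with the correct character set restores it.
restored = raw_bytes.decode("utf-8")

print(ascii(garbled))        # escaped form of the garbled characters
print(restored == original)  # the UTF-8 decode matches the original text
```

This is why forcing UTF-8, or correcting the character set declared in the page source, can make the garbled characters disappear.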

adminha

adminha Purchased

I get a 404 error when I hit the “Save Changes” button on the general settings page. I am redirected to http://www.mydomain.com/wp-admin/admin-post.php, which shows a 404 page. What’s wrong? I’ve just changed the scheduling settings, nothing more!

Hi,

It looks like you have a security issue. This might be related to HTTP user agent and HTTP accept settings. Please try saving the settings after you delete the values of those options.

adminha

adminha Purchased

Wow! You’re great! It worked!

Thanks. I’m glad it worked :)

Suddenly I get a warning that the license has expired. What should I do?

The plugin does not prevent functions from working if the license key is entered. You probably have another problem. You can check your debug file after enabling the debug mode to see if there is something wrong. If you do not see any error messages related to the license key, it means the plugin won’t prevent functions from running.

When I click the save changes button in the license settings, I get the warning that the license has expired again. Is your license server working normally?

Your Content Crawler license has expired. Please get a new license until 06/04/2017 15:05 to continue using Content Crawler.

Message: The license could not be checked with the server. Please try saving your license settings again in a few minutes. If the error persists, please contact the developer.

Yes, it works normally.

Hi there,

I was thinking about developing a crawler, but you did it. My use of this kind of plugin will be very particular: I just want to crawl some websites to take their product ratings, and save/update these values in custom fields.

Additionally, I expect to set up the cron jobs based on the custom post type or taxonomy, in order to manage my server resources precisely.

Do you think your plugin will fit my needs? Does your plugin have an API?

Cheers.

Yes, you can use WordPress’ actions. However, I do not get why you want to use a shortcode. The plugin does not have a shortcode that you can use to manually crawl a post. Also, it is not possible to pass just one selector to the plugin’s built-in methods.

On the other hand, you can use the plugin’s built-in methods to define your own shortcode. You just need to call the right methods. Then, you can retrieve any data you want. After the data is retrieved, you can do anything with the data. So, in this case, you sort of need to write an API for your use case. You can create your own CRON jobs and retrieve the data using the plugin’s methods whenever you like. Again, I do not understand why a shortcode is needed. Just define CRON jobs, call the plugin’s methods when the events are fired, do anything with the retrieved data.

Hi there,

The thing is, I expected to automate the scraping process. Here is an explanation of what I expected:

  • My website will be a huge database in which each post describes a product: video game, software, service… In each product/post, there will be a custom field which will contain (let’s say) the rating of the product scraped from a particular website. Your plugin will interact directly there, by scraping the rating and filling the field.
  • In order to automate the scraping process, I expect to use two additional custom fields: the URL of the page from which I need to scrape the data AND the CSS selector. This way, just by entering the page URL and the CSS selector in these custom fields, combined with a save post action and a do_shortcode() call to which I expect to be able to pass these 2 additional fields (URL and CSS selector), the post will be updated with the scraped rating.
  • The first 2 steps will allow me to fill the ratings in the custom field. The next thing is to keep these ratings up to date. In order to manage my server resources, I would like to define when products from the custom post type “Game” with the taxonomy “action” need to be updated, so that I can launch these cron jobs during off-peak hours. Do you get it?

Cheers.

Well, what is certain is that you need to modify a lot of source code or write a plugin that can benefit from WP Content Crawler’s built-in methods, or write a plugin from scratch. Apparently, the plugin is not designed for what you want to achieve.

omwap

omwap Purchased

Presale question:

Can this plugin grab all the item reviews on this site? goo.gl/DzpTS6

For example, an item like this: goo.gl/7vv9NP

At the bottom of each item page, there is a link to the review page: goo.gl/vbYOi8

After going to the review page, there may also be next page links: https://goo.gl/aR7Vk0

I tried setting it up in the demo, but it only grabbed one review page; it does not go to the next review page to grab more. How do you set it up?

Hi,

Other pages of the review section are loaded via AJAX and there is no next page URL in the page. In other words, since you cannot show a next page URL to the plugin, it cannot get other pages of the reviews.

omwap

omwap Purchased

At the bottom of each item page, if you click on the link shown in this picture goo.gl/vbYOi8, you will go to this page goo.gl/WmEqmU, and there are page links there that are not AJAX.

Then you can use that link. If that link does not exist on product page, you can try to create the link on the product page by using find and replace options. Then you can write a selector for the newly created link.

It does not fetch the posts on paginated pages. It fetches the 1st page, but how can I fetch the 2nd page and the other pages?

Hello,

If you are finding the page links by writing a selector for all page URLs, and the first page’s link is also among those page links, remove the first page by using the unnecessary element selectors. If you are not using the all-page selectors and are using only the next page selector, make sure your next page selector is correct.

I select the posts on the first page, 10 posts. For the next page, I have it select the page ranges, but it still finds 10 posts. I just could not get it to work.

I think you are talking about the category page. When you test the category settings in the Tester, is the next page URL found? If it is not, make sure the next page selector you entered in “Category Next Page URL Selectors” under the “category” tab of the site settings is correct. By the way, if the pages of the target site’s category are loaded via AJAX, the plugin cannot get the next pages.

Hi, can you help me set up the crawler to scrape only the title and publish only the title with the source URL, as on this example site: http://completelyketo.com/ ? I already tried adding the code wcc-main-title to the Post Title Template in the templates, but it does not work.

Hi,

That site redirects the visitors to another page in which the post exists. So, the posts are not on that site. In order for the plugin to be able to create a post, it needs to load the source code of the post page. So, when the plugin loads a post page, probably it cannot find the title selectors you defined. You want to create a post directly from a category page, which the plugin cannot do. It needs to load the source code of post pages.

I did not do anything, but once again I got a license warning, and all the actions, such as the crawl time, are done randomly.

I did not change anything, but I keep getting the following message.

Your Content Crawler license has expired. Please get a new license until 12/04/2017 11:09 to continue using Content Crawler.

Message: The license could not be checked with the server. Please try saving your license settings again in a few minutes. If the error persists, please contact the developer.

Debug mode is enabled and nothing was logged. Can you check my license?

Or can you check with my admin account?

The plugin does not unregister your license key. If it does not exist, it registers your key. So, the problem does not seem to be caused by the plugin. It is probably related to your server, because the license is checked without any problems when I try on another site, which is on a GoDaddy server.

This comment is currently being reviewed.

Sorry, I answer only the support questions asked by buyers.

edniso

edniso Purchased

Hello, I get the error saying that my license has expired. It was working fine without any problem.

“Your Content Crawler license has expired. Please get a new license until 14/04/2017 20:06 to continue using Content Crawler.

Message: The license could not be checked with the server. Please try saving your license settings again in a few minutes. If the error persists, please contact the developer.”

Hi,

The server has been down. It should be fixed in a short period of time. In the meantime, you do not need to worry about the error message, since you can use the plugin for next 3 days without any restrictions. In this period of time, the server will probably be up. Thank you for your patience.

The server is up again. After you save your license settings again, the error message should be gone.

I ran the tester tool and it still crawls the post content. Here is the information:

Date: 5:15 pm April 10, 2017 (2017-04-10 17:15:36) Memory Used: 0.42 MB Time: 917.16 ms

I run it in VirtualBox on a desktop that has a 2-core, 4-thread CPU and 4 GB of RAM.

This bug was in your 1.2.1 version

I upgraded to your latest version. But it can’t crawl more than 8600 posts. Can your plugin only crawl about 8600 posts?

Hi,

No, the plugin does not limit the number of posts that can be crawled.

So I’ve set everything up to work as it should; however, I have a problem with activating the scheduling. I’m not sure why, since I have ticked both options to switch the scheduling on. Could you please help me with this?

There might be a problem either with your category settings or your CRON setup. Please make sure your category settings are correct by using the tester. If everything looks OK there, please install WP Crontrol and check Tools > Cron Events page. Please let me know if there is an error message displayed on that page.

The category was the problem. Works like a charm. Thank you!

You are welcome :)

A nice plugin, thanks.

Hello,

Thanks, I’m glad you like it.
