Discussion on WP Content Crawler - Get content from almost any site, automatically!

turgutsaricam

turgutsaricam supports this item

193 comments found.

Hi,

Can I delete the .git and .idea folders from the previous version (1.4.1) when I update to 1.5.1?

And I would appreciate a little more guidance on adding recrawl buttons to posts. I have implemented a delete post button by adding this to functions.php:

function wp_delete_post_link($link = 'Delete This', $before = '', $after = '') {
    global $post;
    if ( $post->post_type == 'page' ) {
        if ( !current_user_can( 'edit_page', $post->ID ) ) return;
    } else {
        if ( !current_user_can( 'edit_post', $post->ID ) ) return;
    }
    $link = "<a href='" . wp_nonce_url( get_bloginfo('url') . "/wp-admin/post.php?action=delete&amp;post=" . $post->ID, 'delete-post_' . $post->ID) . "'>".$link."</a>";
    echo $before . $link . $after;
}

and adding <?php wp_delete_post_link('Delete This'); ?> to single.php. But I don't even know how to use \WPCCrawler\Factory::toolsController()->recrawlPostManually($postId); :cry: Do I have to do this in functions.php?

Thank you for adding dashboard updates.

Hi,

Yes, you can delete .git and .idea folders. I forgot to delete them. The latest version does not have those files. So, when you update, they will be removed automatically.

First of all, I should say that this is completely out of the supported area. However, since you are so kind, just for this one time, I will give you the code that you can use to achieve your goal.

The code below should go to your theme’s functions.php. You can read the comments to understand what it does.

/**
 * Shows recrawl link for a post and recrawls the post depending on the request
 */
function wpcc_show_recrawl_link_and_maybe_recrawl() {
    // Stop this if the current user is not able to manage the options or Factory class does not exist.
    if(!current_user_can('manage_options') || !class_exists('\\WPCCrawler\\Factory')) return;

    // Get the URL to be recrawled to make sure this post was saved by WP Content Crawler
    $targetUrl = get_post_meta(get_the_ID(), '_wpcc_post_url', true);

    // Do not show anything if there is no URL.
    if(!$targetUrl) return;

    // Name of the parameter that indicates if the post should be recrawled or not. This will be added to the URL.
    $paramName = 'wpcc-recrawl';

    // Recrawl the post if the request has the parameter with the right value
    if(isset($_GET[$paramName]) && $_GET[$paramName] == 1) {
        $id = \WPCCrawler\Factory::toolsController()->recrawlPostManually(get_the_ID());

        // Show the result. If there is an ID, the operation was successful.
        echo '<div style="color: red;">';
        echo $id ? "The post has just been recrawled." : "The post could not be recrawled.";
        echo '</div>';
    }

    // Now, show the link to be used to recrawl the post.
    // Get current post's permalink
    $postPermalink = get_the_permalink();

    // Prepare the link with the arguments. add_query_arg() also handles permalinks that already contain a query string.
    $preparedPermalink = add_query_arg($paramName, 1, $postPermalink);

    // Show the button
    printf('<a href="%1$s" class="button">%2$s</a>', esc_url($preparedPermalink), "Recrawl this post");
}

To use this, you can use the code below in your theme’s single.php or content-single.php, or a similar file:

<?php if(function_exists('wpcc_show_recrawl_link_and_maybe_recrawl')) wpcc_show_recrawl_link_and_maybe_recrawl(); ?>

You are welcome. I’m glad you like the dashboard.

I understand that this is outside your scope of support. When I looked at the code, it was beyond my ability. It works very well. I really appreciate it. :grin::grin::grin:

I was wondering if you could help me. I need to remove the $ and the comma from the price that the plugin is scraping. But I need to remove them ONLY from the following element:

<div class="product-price"> $4,967.00 </div>

I am using regex, but cannot figure out how to target only that element in the find/replace.

Thanks in advance

Oops, sorry, it stripped the code: <div class="product-price"> $4,967.00 </div>

Hi,

You can do it like this:

Find: (<div class="product-price">)[\s\n]+?\$([0-9]),([0-9]+)\.([0-9]+)[\s\n]+?(<\/div>)
Replace: $1$2$3$5
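For readers who want to check the idea outside the plugin, here is a minimal standalone PHP sketch of the same goal: removing the $ sign and the commas only inside the product-price element. It is not the plugin's find/replace implementation; the helper name strip_price_symbols and the sample HTML are assumptions made for illustration, and unlike the pattern above it keeps the decimals.

```php
<?php
// Hypothetical helper (not part of WP Content Crawler): strips "$" and ","
// only inside <div class="product-price">...</div>, leaving other prices alone.
function strip_price_symbols(string $html): string {
    // Group 1: opening tag, group 2: the price text, group 3: closing tag.
    $pattern = '/(<div class="product-price">)\s*([^<]*?)\s*(<\/div>)/';

    return preg_replace_callback($pattern, function (array $m): string {
        // Remove the currency sign and thousands separators from the price only.
        $price = str_replace(['$', ','], '', $m[2]);
        return $m[1] . $price . $m[3];
    }, $html);
}

// Sample input: the second price is outside the target element and stays untouched.
$html = '<div class="product-price"> $4,967.00 </div><p>Total: $1,000</p>';
echo strip_price_symbols($html), PHP_EOL;
```

Using a callback avoids having to enumerate every digit group in the pattern, so prices with more than one comma are handled the same way.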

Hrm. That is still getting the $ sign.

I did it as you explained in the video. I added the URL and pressed Enter, but the page did not open.

http://i.hizliresim.com/3vmXm0.jpg http://i.hizliresim.com/R1WaW7.jpg

I tried about 7-8 sites and it was always the same. On the demo site you provide, the page is displayed like a mobile page. Unfortunately, I get the same problem even on my own WP site. Do we have to wait for an update, or is there a solution?

Could you please send me the sites you’ve tried and faced the same problem via email through my profile page?

I sent them.

I am crawling sites whose posts span multiple pages. I checked the option so that they are crawled, but I want to remove the page break from my post. What should I do? I want to delete <!--nextpage--> [wcc-main-title] in the crawled post, but I do not know how to do it.

Should I set the entire <!--nextpage--> to be ignored in the WordPress function? I still think the title will be repeated. Is this the right way?

I could not entirely understand what you are trying to achieve. Could you please elaborate on that a little bit more?

If your main template has the post title in it, yes, it will be repeated.

I sent an email.

ahmetw Purchased

Hello, I am trying to fetch all hotels in Turkey from booking.com, but the content crawler plugin adds the same hotel over and over. Could you help me with this? Skype: istanbul_ilker@hotmail.com

Thanks.

ahmetw Purchased

Sorry, it was my mistake. Thank you for your help. It works very nicely.

ahmetw Purchased

Is it possible to rename hotel pictures when saving them, for example from 1239.jpg to [hotelname1].jpg?

Sorry, it is not currently possible to rename saved files.

@turgutsaricam I have tried everything to get this site to crawl. I am not sure what I am doing wrong. Is there any way you could take a look if I sent you the login credentials?

Please send me your FTP and admin login credentials through my profile page.

I just sent them over. Thanks so much, I am going nuts with this.

As a matter of fact, if you can tell me what I am doing wrong and suggest a solution, I will go ahead and buy another copy of your plugin just to show support. It is a great plugin; the regex is just a little over my head, I think.

I just purchased your plugin, and right after installation I got this message: “Your Content Crawler license has expired. Please get a new license until 19/02/2017 17:20 to continue using Content Crawler.” Can you explain to me what is going on?

You do not need to put the meta values in a template. They are saved to the post_meta table. When they are saved, WooCommerce will use them to show the information about the product.

To format the data, you can use find and replace options.

So I’ve really been struggling with this. I did get the names of the meta keys for my store and I selected the post type as product, but it still wouldn’t post any prices, so I decided to change the post type to variation_product. That actually stopped the entire process, and now nothing posts at all. I think if you could put together a short video for this, you could boost our sales. For example, I got the plugin for this sole purpose.

You are right. I’ve noted your request. I’ll prepare a tutorial and make improvements regarding WooCommerce products when I find the time. Thank you.

Hi,

Would you be able to provide us with a demo? The current preview is not working right now.

Thanks.

Hi, it’s not working for me. I am not able to login with the credentials listed there.

The problem is fixed now. Please try again. Sorry for the inconvenience.

Hello, thank you for adding the “Recrawling Post” option. You could complete it by adding a delete option for posts that have been deleted on the source websites (for example, listing-type websites delete older posts).

Hi,

You are welcome. This is already in my todo list. I’ll implement this feature when I can find the time. Thanks.

Hello, we are using your crawler plugin together with QQWorld Auto Save Images and Delete Duplicate Posts to fix some issues while saving posts. We use “QQWorld Auto Save Images” to save inline images in posts and make them available locally on the server (the featured image is OK), and “Delete Duplicate Posts” to delete duplicate posts generated by the plugin. There is some delay in deleting duplicate posts and making images local. Could you please check whether you can merge these features into your crawler plugin?

QQWorld Auto Save Images https://wordpress.org/plugins/qqworld-auto-save-images/ Delete Duplicate Posts : https://wordpress.org/plugins/delete-duplicate-posts/ https://cleverplugins.com/delete-duplicate-posts/

Hi,

Regarding duplicate posts, you can see here. The plugin can save images. There is no need to use another plugin for that. However, it’s your choice.

I have read your FAQ about duplicated posts. There are two types of duplicated posts: scenario one is the one you mentioned in the FAQ, but scenario two is about different posts with different content and URLs that use the same title. In this case we get duplicated post titles.

Yes, the plugin uses URLs to check if the post was saved before or not. I’ll add “duplicate post checking via title” to my todo list. Thanks.
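The difference between the two checks can be sketched in a few lines of standalone PHP. This is only an illustration under assumed names (is_duplicate, the $savedPosts array, the sample URLs and titles); it is not how the plugin stores or compares posts.

```php
<?php
// Hypothetical in-memory stand-in for posts that were already saved.
$savedPosts = [
    ['url' => 'https://example.com/a', 'title' => 'Daily deals'],
];

/**
 * Returns true if the candidate looks like a post that was saved before.
 * By default only the source URL is compared; with $checkTitle enabled,
 * a case-insensitive title match also counts as a duplicate.
 */
function is_duplicate(array $savedPosts, array $candidate, bool $checkTitle = false): bool {
    foreach ($savedPosts as $post) {
        if ($post['url'] === $candidate['url']) {
            return true;
        }
        if ($checkTitle && strcasecmp($post['title'], $candidate['title']) === 0) {
            return true;
        }
    }
    return false;
}

// A different URL with the same title: missed by the URL check, caught by the title check.
$candidate = ['url' => 'https://example.com/b', 'title' => 'Daily deals'];
var_dump(is_duplicate($savedPosts, $candidate));       // URL-only check
var_dump(is_duplicate($savedPosts, $candidate, true)); // URL or title check
```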

I am interested in your new plugin “WP Post Tools” and am planning to use it. Good job.

Thanks. I’m glad you like it.

PeterK19 Purchased

Hi. Could you please point me to information on how to use the category map to show just a list of links? It is very interesting.

Secondly, when getting content from other sites, is copyright a concern? Thank you very much, turgutsaricam!

Hi,

Yes, copyright could be an issue. You need to be careful, since it is your responsibility.

I am not sure if I understood the first question, actually. If you want to show a list of URLs that will be taken from a category page, then you can save the category page as a post. Next, you can use the recrawling feature to regularly update the post.

PeterK19 Purchased

Thanks. To clarify my first question: at https://www.youtube.com/watch?v=LQj-HgsKn98 Leandrit Ferizi asked you this question: "Any idea how to get the content from a page that shows just a list of links? So you need to click the link and see the article?" You showed how to use the category map to achieve it.

Can you please tell me where you have more information on how to use the category map, or specifically where I can research more? Can a post also include links and a description? Good day.

Well, you just add the URL of the page that has the post URLs to the category map.

A category is a page that stores URLs of the posts. You add the URLs of the categories to the category map and select a category from your site for them. Then, the plugin goes to those URLs and adds the post URLs in those category pages to the queue. This is explained in the video, actually.
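As a rough standalone illustration of that flow (category URL, then post URLs, then queue), here is a short PHP sketch. The names $categoryMap and enqueue_post_urls, the example.com URLs, and the inline HTML are all assumptions for the example; the plugin's own code and storage are not shown in this thread.

```php
<?php
// Hypothetical category map: source category page URL => local category name.
$categoryMap = [
    'https://example.com/news' => 'News',
];

/**
 * Collects post URLs from a category page's HTML and adds them to the queue.
 * A simple regex keeps the sketch dependency-free; a DOM parser is more robust.
 */
function enqueue_post_urls(string $categoryHtml, string $localCategory, array $queue): array {
    preg_match_all('/<a\s[^>]*href="([^"]+)"/i', $categoryHtml, $matches);
    foreach (array_unique($matches[1]) as $url) {
        // Keyed by URL, so the same post is queued only once.
        $queue[$url] = $localCategory;
    }
    return $queue;
}

// Stand-in for the HTML that would be downloaded from the category URL.
$html = '<ul><li><a href="https://example.com/post-1">One</a></li>'
      . '<li><a href="https://example.com/post-2">Two</a></li></ul>';

$queue = [];
foreach ($categoryMap as $categoryUrl => $localCategory) {
    $queue = enqueue_post_urls($html, $localCategory, $queue);
}
print_r($queue);
```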

A post can include anything you want. You can use the template options under the templates tab to achieve that.

botond Purchased

Hello!

I have 100+ posts in the queue. What do I have to set so that all newly crawled content is saved as posts immediately?

Thank you!

Hi,

You can increase the “Run count for post-crawling event” value under general settings. For more information about the setting, please click the information button next to the name of the setting.
