Discussion on Web Crawler and Scraper for Files and Links

XIO

XIO supports this item

13 comments found.

Can we crawl Mp3 files with this?

Yes. Any type of file. It even tries to find hidden files (like changed extensions).

Can you provide more screenshots or documentation? I’m interested in scraping emails from sub pages of websites, but by your screenshot, it doesn’t look possible.

It is possible. If you don’t set any extensions in the file types box, the program will only search for links and e-mails and save them to the output folder as linkList.txt and emailList.txt.
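For readers who want to see what this kind of link and e-mail harvesting looks like in code, here is a minimal Python sketch. It is not the application's implementation: the regular expression, the class and function names, and the way the two output files are written are all assumptions made for illustration. Only the file names linkList.txt and emailList.txt come from the reply above.

```python
import re
from html.parser import HTMLParser

# Simplified e-mail pattern (an assumption; real-world addresses can be more exotic).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links_and_emails(html):
    """Return (links in document order, sorted unique e-mail addresses)."""
    collector = LinkCollector()
    collector.feed(html)
    emails = sorted(set(EMAIL_RE.findall(html)))
    return collector.links, emails


def save_lists(links, emails, link_path="linkList.txt", email_path="emailList.txt"):
    """Write one entry per line, mirroring the output file names mentioned above."""
    with open(link_path, "w", encoding="utf-8") as f:
        f.write("\n".join(links))
    with open(email_path, "w", encoding="utf-8") as f:
        f.write("\n".join(emails))
```

Calling extract_links_and_emails on the HTML of a page and passing the result to save_lists gives two text files comparable to the ones described.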

Yeah, same as above, please upload more screenshots! Is it able to scrape the subpages of a site, and then get the emails from all of those pages?

The new version is available now. :)

It’s better, but what I mean is: does it crawl all subpages of all the pages I’ve entered and save them, and can it load those saved files of pages to crawl further? So I might end up with a file of 5000 links, open it in the app, and then crawl the emails from all 5000 links :) These 5000 links are just sublinks of the 100 main links.

Yes, you can do that with the application as it is now. :)

I would like to see more screens or a video showing HOW this app will allow me to harvest emails.

Thank you.

More screenshots added.

A few more screenshots or a demo would be great… thanks

Added screenshots.

More screenshots or demo please!

Screenshots added.

I’m in for $5. Let’s see what this puppy can / will do. But be warned: I’m a fussy customer. I expect things to work.

I have fixed the crashes and uploaded the new version. I am waiting for Envato to confirm it. As soon as it is out, I’ll notify you. :)

Awesome! Am waiting patiently. Thanx for the quick work :)

The updated version is available. :)

I need the source files. Is that possible? I want to crawl and search different pages with keywords. Can you help me?

I don’t need files. I need offers on different pages, so I have to search them by entering some keywords. Do you understand what I mean?

You can use regular expressions to do that. The WPF application supports this. To find pages containing any of several keywords, type the keywords separated by the vertical bar | into the regular expression box.
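As an illustration of the keyword alternation described above, here is a small Python sketch. The helper names and the sample keywords are made up for the example, and escaping the keywords with re.escape is an extra precaution of this sketch, not something the application is known to do.

```python
import re


def build_keyword_pattern(keywords):
    """Join keywords with | so the pattern matches if any one of them occurs."""
    # re.escape keeps characters such as '.' or '+' in a keyword literal.
    return re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)


def page_matches(pattern, text):
    """True if at least one keyword appears anywhere in the page text."""
    return pattern.search(text) is not None


pattern = build_keyword_pattern(["offer", "discount", "sale"])  # same as offer|discount|sale
print(page_matches(pattern, "Summer SALE starts today"))  # True
print(page_matches(pattern, "Nothing to see here"))       # False
```

In the regular expression box itself, the equivalent entry would simply be the keywords joined by the bar, for example offer|discount|sale.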

Hi, I am testing this software and it is wonderful and promising! However, I am not very comfortable with regular expressions in WebCrawlerAndScraper. Could you give a little hint about what the example expression (”/^0\d{4}\s\d{6}$/”) does? I don’t see anything related to the topic in the software documentation.

Hello,

The example regular expression matches phone numbers: a leading 0, four more digits, a whitespace character, and then six digits. Generally, regular expressions are a mechanism for matching more complex patterns on a page. This is advanced functionality for matching arbitrary content (limited to what a regular language can express).

The reason regular expression syntax is not included in the documentation is that regular expressions are a well-known tool for text matching. If you want to familiarize yourself with regular expressions, you can visit regular-expressions.info, which has extensive documentation on the topic.

The Crawler allows you to use regular expressions to match specific types of content on the webpages you are crawling.
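To make the example expression concrete, here is how it behaves in Python. The surrounding slashes are JavaScript/Perl-style delimiters and are dropped here; whether the application expects them in its input box is not stated in this thread. The test strings and the unanchored variant are assumptions added for illustration.

```python
import re

# 0, then four digits, one whitespace character, then six digits, and nothing else.
PHONE_RE = re.compile(r"^0\d{4}\s\d{6}$")

print(bool(PHONE_RE.match("01234 567890")))       # True
print(bool(PHONE_RE.match("1234 567890")))        # False: no leading 0
print(bool(PHONE_RE.match("01234567890")))        # False: no whitespace in the middle
print(bool(PHONE_RE.match("call 01234 567890")))  # False: ^ requires the number at the start

# Unanchored variant for finding numbers inside longer text.
PHONE_IN_TEXT_RE = re.compile(r"\b0\d{4}\s\d{6}\b")
print(PHONE_IN_TEXT_RE.findall("call 01234 567890 now"))  # ['01234 567890']
```

Because of the ^ and $ anchors, the original expression only matches when the whole string (or line) is the phone number. For searching inside page text, a variant without the anchors may be the more useful one.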

I’ll try it. Thank you!

I would like to search for Magento installations on a specific country domain extension. Can I use your software? Thanks in advance.

You can search a list of webpages for any regular expression. This includes matching domain extensions in the links found on a given webpage.
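A regular expression for the domain-extension part could look like the following Python sketch. The function name and the example extension de are assumptions, and the pattern only looks at absolute http(s) links in the page source. It says nothing about detecting Magento itself.

```python
import re


def tld_link_pattern(tld):
    """Pattern for absolute http(s) URLs whose host name ends in the given extension."""
    return re.compile(
        r"https?://[^/\s\"'<>]+\." + re.escape(tld) + r"(?=[/:?#\"'\s<>]|$)",
        re.IGNORECASE,
    )


html = (
    '<a href="https://shop.example.de/catalog">x</a> '
    '<a href="http://example.com/de">y</a> '
    '<a href="https://demo.dev/">z</a>'
)
print(tld_link_pattern("de").findall(html))  # ['https://shop.example.de']
```

The lookahead at the end keeps hosts such as demo.dev from being mistaken for a .de domain, and the host part cannot cross a slash, so a /de path segment on a .com site is not matched either.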

Hi! I would like to search the entire web for newly uploaded pages matching my keywords. I am looking for auction houses or galleries that sell the product I want (for example, works by particular painters).

Unfortunately, this program cannot help you do that. It is meant for scraping specific sites/webpages for specific content.

OK, I am sorry to hear that. Can you help me build the program I need? I would pay you, of course.

Hello, can your scraper extract emails from all sub pages of a website, and how about PDF files?

Can it also extract phone numbers from all sub pages, especially on directory sites?

Lastly, do you know of any scraper that can extract emails from members of Facebook pages/groups?

I will be happy to hear from you soon and am going to buy your scraper.

Thank you, Ronnie

XIO

XIO Author

The scraper can find all emails and files of any extension on a page and subpages. However, I am not sure about the directory sites (I haven’t tested that). Generally speaking, if it is HTML, it should work.
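The idea of collecting emails from a page and its subpages can be sketched in Python as a breadth-first crawl that stays on the starting domain. This is an illustration under assumptions, not the scraper's actual code: the function names, the max_pages limit, the simplified e-mail pattern, and the pluggable fetch_page parameter are all invented for the example.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class HrefParser(HTMLParser):
    """Collects href attributes of <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def fetch(url):
    """Download a page as text (default fetcher; swap in another for testing)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_emails(start_url, fetch_page=fetch, max_pages=50):
    """Breadth-first crawl of same-domain pages, returning sorted unique emails."""
    domain = urlparse(start_url).netloc
    queue, seen, emails = deque([start_url]), {start_url}, set()
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except Exception:
            continue  # skip pages that fail to load
        visited += 1
        emails.update(EMAIL_RE.findall(html))
        parser = HrefParser()
        parser.feed(html)
        for href in parser.hrefs:
            link = urljoin(url, href).split("#")[0]  # resolve relative links, drop fragments
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(emails)
```

Links to other domains and mailto: links are never queued, so the crawl stays within the site it was pointed at.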

I do not know whether extracting emails from Facebook users is achievable. It might be doable for users who have made their e-mail address public.

How would I use this tool to extract PDF documents via keyword throughout the web?

XIO

XIO Author

You can scrape a given URL or domain for a particular type of file.

If I understand your question correctly, you want something like a search engine to search the web for PDF files with specific keywords. This program cannot be used for that purpose.

However, you could use Google Advanced Search for it: https://www.google.com/advanced_search

So I could point the app at scribd.com or other document-sharing sites and use a regular expression to grab only the PDFs that match my keywords? How would I do that?

XIO

XIO Author

You can only download all files of a specific type. You cannot search by filename keyword.
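Selecting files purely by type, as described above, amounts to filtering links by their extension. The Python sketch below shows that filtering step only. The function name and the sample URLs are assumptions, and it looks at the extension in the URL path, so it does not cover files whose extension has been changed.

```python
import os
from urllib.parse import urlparse


def links_with_extension(links, extensions):
    """Keep links whose URL path ends in one of the given extensions (case-insensitive)."""
    wanted = {"." + e.lower().lstrip(".") for e in extensions}
    matches = []
    for link in links:
        path = urlparse(link).path            # ignores query strings such as ?dl=1
        ext = os.path.splitext(path)[1].lower()
        if ext in wanted:
            matches.append(link)
    return matches


links = [
    "https://example.com/docs/report.PDF?dl=1",
    "https://example.com/song.mp3",
    "https://example.com/page.html",
]
print(links_with_extension(links, ["pdf"]))  # ['https://example.com/docs/report.PDF?dl=1']
```

Nothing in this filter looks at the file name beyond its extension, which matches the limitation stated above: every file of the chosen type is selected, with no keyword search.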
