Logging received HTML pages
#177
Logging received HTML pages 7 Months, 2 Weeks ago Karma: 0
Hi,

I'm carrying out a feasibility study on web scraping with Convertigo Web Integrator. For us, it would be very useful (almost essential) to be able to log every single HTML page that we browse automatically. However, the logs seem to contain only the transaction's first page.

Is it possible to log each page? If so, how should I do it?

Thanks a lot in advance,
Luis
Sixfingers
Fresh Boarder
Posts: 4
 
#178
Re:Logging received HTML pages 7 Months, 2 Weeks ago Karma: 1
Hi Luis,

What exactly do you mean by logging? Would you like to log the HTML title of each page you browse, or the FULL raw HTML of each page?

You can do your own logging in Convertigo transactions using the log statement.

www.convertigo.com/docs/ReferenceManual/Log.html

Hope it helps.
admin
Convertigo Team
Posts: 34
Last Edit: 2012/02/02 13:29 By elodiee.
 
#179
Re:Logging received HTML pages 7 Months, 2 Weeks ago Karma: 0
Hi again,

We would like to log each FULL HTML page. Better still, it would be ideal for us to log each page in a separate file for each transaction executed. I suppose this may be very specific to us, and maybe no other Convertigo customer would ask for it.

Anyway, your answer really helps. If full HTML pages are available in the context of web clipping, I think we will be able to write logs that meet our needs. I'll have a look at the CWC documentation for that, and I will follow up in this thread if I have further questions.

Thanks a lot!
Sixfingers
Fresh Boarder
Posts: 4
 
#180
Re:Logging received HTML pages 7 Months, 2 Weeks ago Karma: 1
Hi Luis

Here is what you can do if you want to log each HTML page in a separate file:

You can call your transactions from the sequencer. Make each transaction return the HTML page by using the node list extraction rule (www.convertigo.com/docs/ReferenceManual/Nodelist.html).
Then, in the sequencer, you can use the WriteXml step (www.convertigo.com/docs/ReferenceManual/WriteXML.html) to write the source to an XML file that you specify.
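
To make the "one file per page, per transaction" layout concrete, here is a rough sketch in plain Python. It is not Convertigo code and does not use any Convertigo API; the function names (page_log_path, log_page), the directory layout, and the file naming are all assumptions chosen for illustration. It only shows how the output files could be organised once each transaction returns its page source.

```python
import re
from pathlib import Path


def page_log_path(log_root, transaction_id, page_index):
    """Build the path for one logged page (hypothetical layout).

    Layout assumed here: <log_root>/<transaction_id>/page_<NNN>.html
    """
    # Replace characters that are unsafe in file names with underscores.
    safe_id = re.sub(r"[^A-Za-z0-9_.-]", "_", transaction_id)
    return Path(log_root) / safe_id / f"page_{page_index:03d}.html"


def log_page(log_root, transaction_id, page_index, html):
    """Write the full HTML of one page to its own file and return the path."""
    path = page_log_path(log_root, transaction_id, page_index)
    # Create the per-transaction directory on first use.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html, encoding="utf-8")
    return path
```

With this layout, every executed transaction gets its own directory and every browsed page its own numbered file, so the pages of one run can be read back in order.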

Hope this helps
admin
Convertigo Team
Posts: 34
Last Edit: 2012/02/02 13:28 By elodiee.
 