Replies: 1 comment
OK, please update to the new version.
Hi, I am starting out with this project and want to scrape some data from jumbo.com, but I am not getting the expected response. The code is basically this tutorial; I only added headless: False and changed both the link and the prompt.
Instead of the expected list of product categories and web links, I get:
--- Executing Fetch Node ---
--- (Fetching HTML from: https://www.jumbo.com/producten/) ---
--- Executing Parse Node ---
--- Executing GenerateAnswer Node ---
Processing chunks: 0%| | 0/1 [00:28<?, ?it/s]
Scraper Result: {'type': 'accordion', 'title': 'Openingstijden', 'content': 'https://www.jumbo.com/winkels'}
exec_info: [{'node_name': 'Fetch', 'total_tokens': 0, 'prompt_tokens': 0, 'completion_tokens': 0, 'successful_requests': 0, 'total_cost_USD': 0.0, 'exec_time': 83.95596218109131}, {'node_name': 'Parse', 'total_tokens': 0, 'prompt_tokens': 0, 'completion_tokens': 0, 'successful_requests': 0, 'total_cost_USD': 0.0, 'exec_time': 0.00398707389831543}, {'node_name': 'GenerateAnswer', 'total_tokens': 0, 'prompt_tokens': 0, 'completion_tokens': 0, 'successful_requests': 0, 'total_cost_USD': 0.0, 'exec_time': 28.795868158340454}, {'node_name': 'TOTAL RESULT', 'total_tokens': 0, 'prompt_tokens': 0, 'completion_tokens': 0, 'successful_requests': 0, 'total_cost_USD': 0.0, 'exec_time': 112.75581741333008}]
I also tested the tutorial itself (i.e. the original prompt and link), which returns only a single video:
Scraper Result: {'type': 'video', 'title': 'Tech Support: Pyrotechnician Answers Fireworks Questions From Twitter', 'description': 'WIRED is where tomorrow is realized. It is the essential source of information and ideas that make sense of a world in constant transformation.', 'url': 'https://www.wired.com/video/watch/tech-support-pyrotechnician-answers-fireworks-questions-from-twitter'}
The exec info is similar, again with 0 tokens.
What am I doing wrong? Is the code incorrect, is my LLM setup not working, or is it something else?
PS: From a similar discussion I found out it might be due to blockers, so I tried other sites, including Wikipedia. However, the results still did not match the prompt or the tutorial's output. Additionally, shouldn't these blockers theoretically be circumvented by using the proxy and headless: False?
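To make the exec info above easier to read, here is a minimal sketch in plain Python that sums the per-node counters and flags a run that reports no LLM usage. The helper name `summarize_exec_info` is hypothetical (it is not part of the library), and the sample data is trimmed from the jumbo.com run above. Note that I am only assuming that 0 tokens means something is off; it might also just mean that usage is not tracked for my LLM provider.

```python
from typing import Any


def summarize_exec_info(exec_info: list[dict[str, Any]]) -> dict[str, Any]:
    """Summarize per-node exec info (hypothetical helper, not a library function)."""
    # Skip the aggregate row so the counters are not double counted.
    nodes = [e for e in exec_info if e.get("node_name") != "TOTAL RESULT"]
    total_tokens = sum(e.get("total_tokens", 0) for e in nodes)
    requests = sum(e.get("successful_requests", 0) for e in nodes)
    slowest = max(nodes, key=lambda e: e.get("exec_time", 0.0), default=None)
    return {
        "total_tokens": total_tokens,
        "successful_requests": requests,
        "slowest_node": slowest["node_name"] if slowest else None,
        # True when no node reported any tokens or successful requests.
        "no_llm_usage_reported": total_tokens == 0 and requests == 0,
    }


# Sample data trimmed from the jumbo.com run above (only the fields used here).
exec_info = [
    {"node_name": "Fetch", "total_tokens": 0, "successful_requests": 0,
     "exec_time": 83.95596218109131},
    {"node_name": "Parse", "total_tokens": 0, "successful_requests": 0,
     "exec_time": 0.00398707389831543},
    {"node_name": "GenerateAnswer", "total_tokens": 0, "successful_requests": 0,
     "exec_time": 28.795868158340454},
    {"node_name": "TOTAL RESULT", "total_tokens": 0, "successful_requests": 0,
     "exec_time": 112.75581741333008},
]

summary = summarize_exec_info(exec_info)
print(summary)
```

For my run this shows that most of the time is spent in the Fetch node and that no node reports any tokens or successful requests.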