ScrapeNinja

FREEMIUM
By Anthony | Updated 5 days ago | Data


escape character, source code backslashed

Rapid account: Alvro Lopez
AlvroLopez
2 years ago

Every time I make a ScrapeNinja request I get a “backslashed” response: there is a \ before every quotation mark and before every other backslash. This makes parsing tricky because, for example, Scrapy selectors won’t find the right attributes, and adding backslashes to the selectors doesn’t fix anything, since Python reads a backslash as an escape character rather than as a regular character. Doubling the backslashes doesn’t work either, and regex (which also uses \ as an escape character) gets messy, since you can easily end up with 5 backslashes for a single newline. Here is a brief example I just got from an e-commerce site:

<a class="LEVEL_3" id="Nav_W1101_0" href="/compra-online/bebidas/agua-soda-y-gaseosas/c/W1101" title="Agua, Soda y Gaseosas">Agua, Soda y Gaseosas</a>\n<ul class="LEVEL_3">\n</ul>\n</div>\n

Is there any way I can get the original source code? Stripping the backslashes with str.replace() usually destroys useful information.
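
To make the problem concrete, here is a minimal Python sketch of why deleting backslashes wholesale loses information. It assumes the cleanup in question amounts to removing every backslash from the raw text; the sample string and the helper name `strip_backslashes` are made up for illustration and are not part of ScrapeNinja.

```python
# Hypothetical raw text as it might arrive before any JSON parsing:
# quotes and newlines inside the HTML are escaped with backslashes.
raw = r'<a class=\"LEVEL_3\" href=\"/c/W1101\">Agua</a>\n<ul class=\"LEVEL_3\">\n</ul>\n'


def strip_backslashes(text: str) -> str:
    """Naive cleanup: delete every backslash character."""
    return text.replace("\\", "")


cleaned = strip_backslashes(raw)

# The quotes look right now, but each escaped newline (backslash + "n")
# has turned into a stray letter "n" glued to the markup.
print(cleaned)
```

The quotes come out fine, but the escape sequences that carried real meaning (here, the newlines) are silently corrupted, which is why a blanket replace is not a safe way to recover the HTML.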

Rapid account: Restyler
restyler Commented 2 years ago

Hi!
I think we are now also dealing with RapidAPI’s text formatting and escaping 😃
ScrapeNinja serializes the response to JSON, so it has to escape certain characters, but once you parse that JSON, the HTML should be back in its original form. Are you saying that you still have escaping issues AFTER you have parsed the returned JSON object and retrieved the .body property of the ScrapeNinja response?
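
As a rough illustration of that point, the sketch below parses a JSON document and reads its `body` property with Python's standard `json` module. The response text is a shortened, hypothetical stand-in (a real ScrapeNinja response carries more fields), and `extract_html` is a name invented for this example.

```python
import json

# Hypothetical, shortened response text. Inside the JSON string the HTML's
# quotes and newlines are escaped, which is what looks "backslashed".
raw_response = (
    r'{"body": "<a class=\"LEVEL_3\" href=\"/c/W1101\" '
    r'title=\"Agua, Soda y Gaseosas\">Agua, Soda y Gaseosas</a>\n'
    r'<ul class=\"LEVEL_3\">\n</ul>\n"}'
)


def extract_html(response_text: str) -> str:
    """Parse the JSON envelope and return the HTML stored under "body"."""
    data = json.loads(response_text)  # the JSON parser undoes the escaping
    return data["body"]


html = extract_html(raw_response)

# The result has plain quotes and real newline characters, no backslashes.
print(html)
```

If the request is made with the `requests` library, calling `response.json()["body"]` instead of reading `response.text` should have the same effect, since it runs the same JSON decoding step. The resulting string can then be handed to a selector or HTML parser as-is.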

https://youtu.be/UHOY-LubMsM?t=210
