How do I get past a cookie agreement page using Python and Selenium?

Views: 572 Published: 2022-10-16 Tags: python selenium-webdriver selenium queryselector shadow-dom

Problem description

I am quite new to Selenium and Python. I am trying to navigate through a webpage (https://www.heise.de/download/) but can't get past the cookie wall (cookie agreement page). It seems that the webdriver can't find the button to click.

The HTML code:

<button type="button" backgroundcolor="[object Object]" data-testid="uc-accept-all-button" fullwidth="true" class="sc-bdnylx fmRkNf">Alles akzeptieren</button>

from selenium import webdriver
from time import sleep
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Using Firefox to access web
driver = webdriver.Firefox(executable_path=r'C:\Users\aphro\Anaconda3\geckodriver.exe')

# Open the website
driver.get('https://www.heise.de/download/')
sleep(5)
if driver.find_element_by_xpath('//*[@id="usercentrics-root"]'):
    print("found popup window")

# find button class
wait = WebDriverWait(driver, 10)
element = wait.until(EC.element_to_be_clickable((By.XPATH, '//div[@class="sc-bdnylx fmRkNf"]')))
element.click()

Error message:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.1\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.1\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/aphro/PycharmProjects/SoftwareProject/SoftwareProject/webControl.py", line 18, in <module>
    x = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div[data-testid='uc-accept-all-button'][role='button']"))).click()
  File "C:\Users\aphro\AppData\Roaming\Python\Python38\site-packages\selenium\webdriver\support\wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: 


I've tried different locator strategies, including CSS selectors and XPath. I use Python 3.8 and Selenium 3.141.

Can anyone point me in the right direction on how to get past this cookie wall (programmatically click the button) and reach the actual webpage?

Answer

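The Alles akzeptieren element sits inside a #shadow-root (open), so locators evaluated against the main document, such as the XPath and CSS selectors in the question, cannot reach it.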
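To check for yourself that the button really lives in the shadow root, a small probe along the following lines may help. It is only a sketch: probe_shadow_dom, explain and PROBE_JS are hypothetical names that are not part of Selenium, and the only driver method it relies on is execute_script().

```python
# JavaScript probe. arguments[0] is the CSS selector of the shadow host,
# arguments[1] the CSS selector of the element we are looking for.
PROBE_JS = """
const host = document.querySelector(arguments[0]);
const root = host ? host.shadowRoot : null;
return {
  hostFound: host !== null,
  openShadowRoot: root !== null,
  inDocument: document.querySelector(arguments[1]) !== null,
  inShadowRoot: root !== null && root.querySelector(arguments[1]) !== null
};
"""


def probe_shadow_dom(driver, host_css, inner_css):
    """Ask the browser where inner_css can be found (hypothetical helper)."""
    return driver.execute_script(PROBE_JS, host_css, inner_css)


def explain(result):
    """Turn the probe result into a one-line verdict."""
    if result["inDocument"]:
        return "reachable with ordinary locators"
    if result["inShadowRoot"]:
        return "inside an open shadow root; query it through shadowRoot"
    if result["hostFound"] and not result["openShadowRoot"]:
        return "host found, but it exposes no open shadow root"
    return "not found; the banner may not have rendered yet"
```

Calling explain(probe_shadow_dom(driver, "#usercentrics-root", "button[data-testid='uc-accept-all-button']")) once the page has loaded should, if the diagnosis above holds, report the shadow-root case.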

Solution

To click Alles akzeptieren, call shadowRoot.querySelector() on the shadow host through execute_script(). You can use the following locator strategy:

Code Block:

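import time
from selenium import webdriver

driver = webdriver.Firefox()  # assumes geckodriver is on PATH
driver.get("https://www.heise.de/download/")
time.sleep(5)  # crude wait for the consent banner to render
element = driver.execute_script("""return document.querySelector('#usercentrics-root').shadowRoot.querySelector('footer div div div button[data-testid="uc-accept-all-button"]')""")
element.click()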
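The fixed five-second sleep can be too short on a slow connection and wasteful on a fast one. As a hedged alternative, the sketch below polls the same shadow-root query until the button appears or a timeout expires. wait_for_accept_button and ACCEPT_BUTTON_JS are hypothetical names, the shorter button selector is an assumption based on the data-testid shown in the question, and the helper only needs an object with an execute_script() method.

```python
import time

# Returns the button element, or null while the banner has not rendered yet.
ACCEPT_BUTTON_JS = """
const host = document.querySelector('#usercentrics-root');
if (!host || !host.shadowRoot) { return null; }
return host.shadowRoot.querySelector('button[data-testid="uc-accept-all-button"]');
"""


def wait_for_accept_button(driver, timeout=10.0, poll=0.5):
    """Poll the shadow root until the button exists (hypothetical helper)."""
    deadline = time.monotonic() + timeout
    while True:
        button = driver.execute_script(ACCEPT_BUTTON_JS)
        if button is not None:
            return button
        if time.monotonic() >= deadline:
            raise TimeoutError("accept button not found in shadow root")
        time.sleep(poll)
```

With a real driver, wait_for_accept_button(driver).click() placed right after driver.get(...) would stand in for the sleep and the execute_script() call above.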

References

You can find a couple of relevant discussions in:

Can't locate elments within shadow-root (open) using Python Selenium
