AI Roundtable Open Debate · 4 models

How can I automate submitting to SwissADME and pkCSM and retrieving the output?

Answer: Scrape web forms using Python's requests or Selenium, but include delays to protect fragile academic infrastructure. (4 of 4 models agreed)

Counterpoint: Scraping undocumented endpoints is too unreliable for production research and risks overloading shared academic servers that lack proper anti-abuse protections.

Model Responses

Grok 4.20

Answer: You can automate submissions to SwissADME and pkCSM using Python with the requests library for direct HTTP calls or Selenium for browser simulation, followed by parsing outputs, provided you respect terms of service, add delays, and avoid server overload.

My original assessment remains valid because SwissADME and pkCSM are form-based web tools whose endpoints can be inspected and replicated via standard HTTP POST requests with SMILES inputs, as demonstrated in both my initial response and Claude Opus 4's detailed examples. This approach, when combined with BeautifulSoup for HTML table extraction, enables scalable batch processing into CSV or JSON formats without needing an official API.

I did not change my position after reviewing the other models. GPT-5.4's emphasis on seeking permission first and preferring local alternatives is prudent for large-scale or production use, but it overstates the impracticality for researchers who routinely employ such scripts responsibly; the core question asks 'how can I automate,' not whether one should avoid it entirely. Gemini 3.1 Pro aligns closely with the scraping methods I described, reinforcing rather than contradicting them. Claude Opus 4's answer is nearly identical to mine, including code skeletons for requests and Selenium, plus practical advice on batching and alternatives like RDKit, which supports my view that these tools are automatable today even if fragile to site changes.

The shared caveats about terms of service and rate limits address the ethical concerns without rendering the technical guidance useless. For users with legitimate needs, this enables efficient screening of compound libraries while minimizing risks through delays and caching by canonical SMILES.
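A minimal sketch of the requests-plus-BeautifulSoup pattern Grok describes follows. The endpoint URL, form field name, and result-table markup are assumptions for illustration only; inspect the site's actual network traffic, and confirm its terms permit scripted access, before adapting anything like this.

```python
# Sketch of direct HTTP form submission with HTML table parsing.
# SUBMIT_URL and the "smiles" field name are assumed, not documented.
import time
import requests
from bs4 import BeautifulSoup

SUBMIT_URL = "http://www.swissadme.ch/index.php"  # assumed form endpoint


def submit_smiles(smiles: str, delay: float = 5.0) -> dict:
    """POST one SMILES string and parse the first HTML table in the reply."""
    resp = requests.post(SUBMIT_URL, data={"smiles": smiles}, timeout=120)
    resp.raise_for_status()

    soup = BeautifulSoup(resp.text, "html.parser")
    table = soup.find("table")  # assumed: results rendered as an HTML table
    results = {}
    if table is not None:
        for row in table.find_all("tr"):
            cells = [c.get_text(strip=True) for c in row.find_all(["th", "td"])]
            if len(cells) == 2:
                results[cells[0]] = cells[1]

    time.sleep(delay)  # conservative pause to avoid hammering the server
    return results
```

The same loop can write each dictionary to CSV or JSON for batch runs; the delay and the single-table assumption are the parts most likely to need adjustment against the real pages.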

GPT-5.4

Answer: I maintain my original position: there is no reliable, officially supported public API for SwissADME or pkCSM, so automation is only advisable through explicit provider permission, terms-compliant browser automation, or by switching to tools designed for scripted batch use.

The other models did not give me enough reason to change my answer because they moved too quickly from “these are web apps” to “therefore you can automate them with requests or Selenium,” without adequately separating technical possibility from supported, durable, and permissible practice. Grok and Claude were the least convincing on the key risk point: both implied that direct HTTP POST automation is a practical path after inspecting browser traffic. That may sometimes work in the short term, but it is not the same as having a stable interface. Undocumented form endpoints can change without notice, may depend on hidden fields, cookies, CSRF protections, asynchronous jobs, or anti-abuse controls, and can break silently. More importantly, recommending reverse-engineering as the main answer underplays the compliance and infrastructure concerns for academic web servers.

Gemini was closer to my position because it explicitly noted the lack of an official public REST API and framed Selenium/Playwright as scraping-based automation rather than true API access. But I still would not change my stance, because saying “you must build web scraping scripts” is too strong and too operational as a default recommendation. In many cases, the first and best step is not to script against the website at all, but to check terms, usage policies, publication notes, and contact the maintainers for approved bulk access or downloadable models.

The strongest part of the other answers was the practical detail about Selenium, waits, parsing HTML, and downloading CSVs. I agree those are implementation options if, and only if, the site explicitly permits that style of use. But those details do not overturn my original claim that there is generally no reliable official API and that unsupported automation is brittle and potentially disallowed. So my position remains unchanged. The central issue is not whether browser or HTTP automation can sometimes be made to work; it is that such methods are unofficial, fragile, and may violate terms or burden shared services. For a responsible answer, the hierarchy should be: seek approved access first, use compliant browser automation only if allowed, and otherwise prefer local or openly scriptable ADME/toxicity tools for production workflows.
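One small, concrete way to start the "check before you script" step GPT-5.4 recommends is to read each site's robots.txt with Python's standard-library `urllib.robotparser`. This is only a first signal, not a substitute for reading the terms of service or emailing the maintainers, and the robots.txt URLs below are assumptions for illustration.

```python
# Quick robots.txt check before considering any automation.
# The URLs are assumed; verify the actual hosts for each tool.
from urllib import robotparser

SITES = {
    "SwissADME": "http://www.swissadme.ch/robots.txt",
    "pkCSM": "https://biosig.lab.uq.edu.au/pkcsm/robots.txt",
}

for name, robots_url in SITES.items():
    rp = robotparser.RobotFileParser()
    rp.set_url(robots_url)
    try:
        rp.read()  # fetch and parse robots.txt
        base = robots_url.rsplit("/", 1)[0] + "/"
        allowed = rp.can_fetch("my-research-bot", base)
        print(f"{name}: robots.txt {'does not disallow' if allowed else 'disallows'} automated fetching")
    except OSError as exc:
        print(f"{name}: could not read robots.txt ({exc}); contact the maintainers directly")
```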

Gemini 3.1 Pro

Answer: To automate submissions to SwissADME and pkCSM, you must reverse-engineer their web forms using Python's `requests` or `Selenium`, but you should strongly consider using local, open-source alternatives to avoid violating terms of service or overloading academic servers.

Neither SwissADME nor pkCSM provides an official, documented REST API for programmatic access. Because of this, automation relies entirely on mimicking human interaction. While my original recommendation focused heavily on browser automation using Selenium or Playwright, the other models correctly pointed out that inspecting network traffic and using the Python `requests` library for direct HTTP POST calls is a much faster and less resource-intensive method. By sending the SMILES payload directly to the server's endpoint and parsing the returned HTML with `BeautifulSoup` or `pandas`, you can bypass the overhead of running a headless browser.

However, the arguments from GPT-5.4 and Claude Opus 4 regarding Terms of Service (ToS) and server strain convinced me to shift my overall strategy. These are free academic servers; aggressively scraping them without explicit permission is fragile, likely to result in an IP ban, and places an unfair load on their infrastructure. If you must scrape, you must implement strict rate-limiting (e.g., artificial delays between requests) and handle errors gracefully.

Ultimately, the most robust, reproducible, and compliant approach is to avoid scraping altogether. For high-throughput ADMET profiling, you are much better off substituting these web tools with local, scriptable open-source alternatives. Libraries like RDKit can calculate many of the foundational physicochemical descriptors found in SwissADME, while tools like Mordred, PaDEL, or ADMET-specific models like DeepPurpose and the Therapeutics Data Commons (TDC) allow you to run batch analyses locally without network dependencies or ToS constraints.
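As a sketch of the local-first alternative Gemini describes, RDKit can compute several of the SwissADME-style physicochemical descriptors entirely offline. Note that these are raw descriptors, not the trained ADMET predictions the web tools return, so this is a partial substitute at best.

```python
# Local physicochemical profiling with RDKit; no web access required.
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski


def local_profile(smiles: str) -> dict:
    """Compute a handful of Lipinski-type descriptors for one molecule."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    return {
        "MW": Descriptors.MolWt(mol),
        "LogP": Descriptors.MolLogP(mol),
        "TPSA": Descriptors.TPSA(mol),
        "HBD": Lipinski.NumHDonors(mol),
        "HBA": Lipinski.NumHAcceptors(mol),
        "RotatableBonds": Descriptors.NumRotatableBonds(mol),
    }


print(local_profile("CC(=O)OC1=CC=CC=C1C(=O)O"))  # aspirin
```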

Claude Opus 4

Answer: You can automate SwissADME and pkCSM submissions using Python with HTTP requests or browser automation (Selenium/Playwright), but you should verify terms of service compliance first and consider local/open-source ADMET alternatives for robust production pipelines.

I'm refining my original position slightly after considering the other models' arguments, though my core recommendation remains the same. GPT-5.4 raised an important point that I underemphasized: the fragility and ethical dimensions of scraping academic web services without documented APIs. Building against undocumented endpoints is inherently brittle (any front-end change breaks your workflow) and may violate terms of service or overload servers maintained by small academic teams. This is a legitimate concern that deserves more prominence than I originally gave it.

However, GPT-5.4's suggestion to simply replace these tools with local alternatives isn't always practical. SwissADME and pkCSM implement specific predictive models that aren't trivially reproduced by RDKit descriptors or generic open-source ADMET packages. If a researcher needs the exact predictions from these tools (e.g., for consistency with published work), substitution isn't equivalent. Grok's answer was closely aligned with mine and didn't add substantial new insight beyond confirming the same technical approach. Gemini's answer was also well aligned, with the useful practical addition of suggesting a search of GitHub for existing community-maintained scrapers, a good tip I should have included.

My refined recommendation is layered:

**Step 1: Contact the maintainers.** Both are academic tools. Email the teams and ask about batch access or API availability. This is the most sustainable path.

**Step 2: If permitted, use browser automation or HTTP replay.** Inspect network traffic with browser dev tools to identify POST endpoints and required parameters. Use `requests` for simple form submissions or Selenium/Playwright for JavaScript-heavy pages. Add conservative delays (2-5 seconds between submissions), implement error handling, and cache results by canonical SMILES to avoid resubmission (a sketch follows below).

**Step 3: Check GitHub for existing wrappers.** Community scripts already exist for both tools and can save significant development time.

**Step 4: For production-scale pipelines, consider local alternatives.** Tools like ADMETlab, DeepPurpose, or ToxValDB can provide similar predictions without web-scraping fragility. RDKit handles Lipinski-type rules and physicochemical descriptors natively.

The technical approach (HTTP requests + BeautifulSoup or Selenium) is well established and works. But I now more strongly emphasize that responsible use (rate limiting, terms compliance, and contacting maintainers) should come first, not as an afterthought.
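The sketch below illustrates the Step 2 pattern under the stated assumptions: Selenium for the form interaction, conservative delays between submissions, and a cache keyed by canonical SMILES so repeat compounds never hit the server. The URL, field names, and result locator are placeholders, not the real page structure of either tool.

```python
# Browser-automation sketch with delays and canonical-SMILES caching.
# FORM_URL and the element names are placeholders; inspect the real page
# (and confirm automated use is permitted) before adapting.
import json
import time
from pathlib import Path

from rdkit import Chem
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

CACHE_FILE = Path("admet_cache.json")
FORM_URL = "https://example.org/predict"  # placeholder endpoint


def canonical(smiles: str) -> str:
    """Canonicalize SMILES so repeat compounds hit the cache, not the server."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Unparseable SMILES: {smiles!r}")
    return Chem.MolToSmiles(mol)


def predict(smiles_list, delay: float = 5.0) -> dict:
    cache = json.loads(CACHE_FILE.read_text()) if CACHE_FILE.exists() else {}
    driver = webdriver.Chrome()
    try:
        for raw in smiles_list:
            smi = canonical(raw)
            if smi in cache:
                continue  # already fetched; skip resubmission
            driver.get(FORM_URL)
            driver.find_element(By.NAME, "smiles").send_keys(smi)  # assumed field
            driver.find_element(By.NAME, "submit").click()         # assumed button
            # Wait for the (assumed) results table to render.
            table = WebDriverWait(driver, 120).until(
                EC.presence_of_element_located((By.TAG_NAME, "table"))
            )
            cache[smi] = table.text
            CACHE_FILE.write_text(json.dumps(cache, indent=2))  # persist as we go
            time.sleep(delay)  # conservative pause between submissions
    finally:
        driver.quit()
    return cache
```

Writing the cache to disk after each compound means an interrupted run can resume without resubmitting anything, which keeps load on the shared servers to a minimum.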

Final Summary

The AI Roundtable concluded that because no official APIs exist, automation must rely on mimicking browser behavior with Python scripts to submit SMILES strings. While Grok 4.20 and Claude Opus 4 detailed technical methods for parsing HTML outputs, GPT-5.4 served as the primary skeptic, warning that such methods are brittle and may violate academic terms of service. Gemini 3.1 Pro eventually suggested that for high-volume tasks, substituting these sites with local open-source tools like RDKit is more sustainable.

4 of 4 models agreed