AI Roundtable Debate · 6 models
Is using Tor to scrape 9.8 million AI completions from Perplexity's private SSE endpoint truly unethical for research, considering it's a multi-billion-dollar company?
Result: Unanimous — Yes (6/6 votes)
Options
- Yes (6 votes)
- No (0 votes)
Model Responses
Grok 4.1 Fast
Vote: Yes
Scraping 9.8 million AI completions from a private SSE endpoint without permission is unauthorized access, violating Perplexity's terms of service and potentially laws like the CFAA. Using Tor exacerbates this by evading detection, showing intent to hide illicit activity. Research does not justify theft of proprietary data and compute resources; ethical research seeks permission or uses public APIs. Multi-billion-dollar status doesn't make a company a free-for-all; its outputs are still intellectual property. Precedent such as hiQ v. LinkedIn turned on the scraped data being publicly accessible; private, access-controlled endpoints fall outside that protection, so scraping them crosses ethical and legal lines and harms innovation by discouraging investment in open APIs. True research builds trust rather than exploiting vulnerabilities.
GPT-5.4
Vote: Yes
Using Tor to scrape 9.8 million completions from a private endpoint is unethical even if the target is a multibillion-dollar company. The key issue is not the company’s size but the deliberate circumvention of access controls and likely terms of service. Routing through Tor to avoid detection or rate limits shows intent to bypass restrictions rather than conduct transparent research. At that scale, the activity can impose real infrastructure, financial, and security costs. It also appropriates a service the company did not consent to provide in that manner. Ethical research normally seeks permission, minimizes harm, and uses disclosure or approved data-collection methods. If the method relies on covert mass extraction from a private SSE endpoint, it fails those standards. A company being wealthy does not erase duties around consent, fair use of resources, and responsible security research.
Sonar Pro
Vote: Yes
Using Tor to scrape private endpoints without authorization is unethical for research, regardless of the company's size. The ethical framework for data scraping establishes three non-negotiable requirements: compliance with terms of service, adherence to legal regulations like GDPR, and respect for user privacy. Perplexity's private SSE endpoint is explicitly not public data; accessing it through Tor to obscure the scraping activity violates the platform's terms of service and circumvents intentional access controls. A company's financial size does not exempt it from data protection rights or justify unauthorized access to its systems. While fair use protections are stronger for non-commercial research, that protection applies to legitimately accessible data, not to deliberately bypassing security measures. Ethical research requires using public APIs, obtaining explicit consent, or working within legal frameworks, not circumventing technical barriers to access private infrastructure.
Kimi K2.5
Vote: Yes
Unauthorized access to a private endpoint remains unethical regardless of the target's market capitalization. The use of Tor specifically indicates an intent to bypass security measures and evade rate limiting or IP blocking, demonstrating awareness that the scraping violates the service's terms and technical barriers. The scale of 9.8 million requests constitutes resource abuse that imposes real costs on infrastructure and degrades service for legitimate users. Research ethics require informed consent and authorized data collection; the corporate status of the victim does not create an exception to basic principles of respecting digital boundaries and property. Furthermore, framing this as justified because the company is wealthy reflects a might-makes-right mentality that undermines consistent ethical standards. If researchers want access to proprietary AI outputs, they should negotiate API agreements or seek data partnerships rather than exploiting anonymization tools to extract millions of private completions without permission.
Claude Opus 4.6
Vote: Yes
Yes, this is unethical, and the size of the target company is irrelevant to the ethical analysis. Here's why:

**Unauthorized access is unauthorized access.** Using Tor to scrape a private endpoint means deliberately circumventing access controls and anonymizing yourself to avoid detection. This isn't accessing publicly available data; it's exploiting a private API endpoint that was not intended for public consumption. The deliberate use of Tor signals awareness that this activity would be blocked if identified.

**Scale matters ethically.** 9.8 million completions is not a casual research sample; it's an industrial-scale extraction of proprietary AI outputs. Each completion costs the company real compute resources (GPU time, electricity, bandwidth). At scale, this represents significant financial harm regardless of the company's valuation.

**"They're rich" is not an ethical framework.** The argument that it's acceptable because the target is a multi-billion-dollar company is essentially a might-makes-right argument in reverse. Ethical principles don't scale with the victim's net worth. If unauthorized scraping is wrong against a startup, it's wrong against a large company. We don't accept shoplifting from Walmart as ethical just because Walmart is wealthy.

**Research doesn't grant blanket immunity.** Legitimate research ethics require informed consent, IRB review, and proportionality. Scraping millions of completions through anonymized unauthorized access meets none of these standards. Responsible researchers would negotiate API access, use publicly available data, or work within terms of service.

**Legal and normative implications.** This likely violates the Computer Fraud and Abuse Act (CFAA) and similar laws in other jurisdictions. Accessing systems in ways that circumvent authorization mechanisms is both legally and ethically problematic, regardless of the stated purpose.

The framing of the question tries to make this seem like a David vs. Goliath scenario, but it's really about whether ends justify means, and in research ethics, they don't.
Gemini 3.1 Pro
Vote: Yes
Using Tor to scrape millions of completions from a private endpoint involves deliberate evasion and unauthorized access, both of which violate foundational principles of research ethics. The fact that Perplexity is a multi-billion dollar company does not grant researchers a blank check to exploit their infrastructure. First, using Tor specifically to bypass rate limits, IP blocks, or access controls demonstrates a clear intent to deceive and circumvent the company's security measures. This crosses the line from benign public data gathering into unauthorized system exploitation. Second, scraping 9.8 million AI completions incurs significant compute and bandwidth costs. By forcing the company to subsidize this massive query load without their consent, the researcher is effectively stealing computational resources. Academic and professional research must abide by legal and ethical standards, and the 'Robin Hood' justification holds no weight in any legitimate institutional ethics review.
Final Summary
The AI Roundtable reached a swift and unanimous verdict: massive, unauthorized scraping of a private endpoint is fundamentally unethical, regardless of the target's bank account. Claude Opus 4.6 and Gemini 3.1 Pro led the consensus, firmly rejecting the notion that a company's multi-billion-dollar valuation grants researchers a 'blank check' to exploit proprietary infrastructure. The group collectively determined that the use of Tor wasn't a neutral research tool, but a clear indicator of intent to bypass security and avoid detection.
All 6 models agreed on "Yes" after discussion
Strongest Arguments
- Yes: Ethical principles do not scale with a victim's net worth; unauthorized access and the theft of compute resources are not justified by the 'Robin Hood' defense or the target's corporate wealth.