Recent research has found that free apps are quietly turning smart TVs into proxies for artificial intelligence (AI) data scraping. The study, conducted by Include Security and independent researcher Buchodi, describes how Bright Data’s iOS Software Development Kit (SDK) is embedded in consumer applications, making devices such as smart TVs exit nodes for web-scraping traffic that Bright Data markets to the AI industry.
Expanding the Proxy Network
Bright Data, formerly known as Luminati, operates what it claims is the largest residential proxy network in the world, with over 400 million residential IP addresses. The network is maintained in part through the SDK, which is included in free apps under the guise of consent-based usage. The SDK is said to supply a pool of over 150 million IP addresses.
Published on June 5, the findings matter because the practice shifts data scraping from professional data centers onto personal home networks. The immediate concern is not unauthorized access to accounts but the use of home internet connections and bandwidth for external data collection.
Technical Details and Risks
The research found that the iOS SDK, when activated, communicates with Bright Data servers without verifying the requester’s identity. The server can then direct the device to retrieve web pages over the user’s home internet connection, creating a potential security vulnerability. The peer channel that carries these operations lacks robust security checks, leaving it with security weaknesses comparable to those found in malware.
Notably, on iOS devices this traffic can bypass a configured VPN and continue in the background without the user’s knowledge. That raises privacy concerns, especially since the SDK’s opt-in screen does not accurately reflect the extent of its capabilities.
Consent and Ethical Concerns
The SDK’s opt-in consent is often misleading, as seen in apps like Petflix on Roku, which describe the usage as occasional while allowing up to 200 GB of traffic per month. In certain regions, such as Uzbekistan and Oman, usage caps are even higher. Additionally, the SDK can link multiple devices under one user profile, amplifying data collection capabilities.
Bright Data publicly lists its app partners, which include smart-TV app developers such as PlayWorks Digital and CloudTV. However, inclusion on this list does not confirm that an app currently uses the SDK, so each app has to be checked individually.
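To put the 200 GB monthly cap in perspective, it can be converted into an average sustained data rate. The short Python sketch below does that arithmetic under two assumptions that are not in the research: a 30-day month and decimal gigabytes (1 GB = 10^9 bytes). The function name is illustrative only.

```python
def average_mbit_per_second(gigabytes_per_month: float, days: int = 30) -> float:
    """Convert a monthly traffic volume into an average rate in Mbit/s.

    Assumes decimal gigabytes (1 GB = 10**9 bytes) and a month of `days` days.
    """
    bits = gigabytes_per_month * 10**9 * 8   # bytes -> bits
    seconds = days * 24 * 60 * 60            # length of the month in seconds
    return bits / seconds / 10**6            # bits/s -> Mbit/s


# The 200 GB cap mentioned above, spread evenly over a 30-day month.
rate = average_mbit_per_second(200)
print(f"{rate:.2f} Mbit/s on average")
```

Under these assumptions the cap works out to roughly 0.6 Mbit/s flowing around the clock, which is hard to square with a description of occasional use.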
Addressing the Issue
The practice of using residential connections for AI data scraping is not new, but its scale is increasing due to demand from AI companies. Traditional anti-bot measures block data center IPs, whereas traffic arriving from residential connections is much harder to tell apart from ordinary home users. Recent actions by tech giants such as Google, Amazon, and Roku have restricted background proxy SDKs, prompting Bright Data to pivot towards platforms like Samsung’s Tizen and LG’s webOS.
To mitigate these risks, users can block the specific web addresses used by the SDK at the router level using tools like Pi-hole or NextDNS. These measures prevent devices from acting as data relays without disrupting Bright Data’s paid services. Organizations that manage mobile devices should also monitor for apps containing the SDK, although a device on mobile data bypasses network-level blocks.
