_______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (unofficial)
   URI Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
   URI   Disrupting the largest residential proxy network
       
       
        niedbalski wrote 1 hour 50 min ago:
        Thanks google for saving us. I guess this is the equivalent of rival
        narcos fighting each other.
       
        AugustoCAS wrote 3 hours 58 min ago:
        This was easy because it's a Chinese company.
        
        The largest companies in this space that do similar things (Oxylabs,
        Bright Data, etc.) have similar tactics but are based in a different
        location.
       
        edg5000 wrote 6 hours 36 min ago:
        Residential proxies are the only way to crawl and scrape. It's ironic
        for this article to come from the biggest scraping company that ever
        existed!
        
        If you crawl at 1Hz per crawled IP, no reasonable server would suffer
        from this. It's the few bad apples (impatient people who don't rate
        limit) who ruin the internet for both users and hosters alike. And then
        there's Google.
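        
        A rough sketch of the 1 Hz idea, in Python. It assumes throttling is
        keyed on the target hostname (a stand-in for the server's IP), and the
        PerHostThrottle name and its parameters are made up for illustration:

```python
import time
from urllib.parse import urlsplit


class PerHostThrottle:
    """Allow at most one request per `min_interval` seconds to each host."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the logic can be tested
        self._sleep = sleep
        self._next_allowed = {}  # host -> earliest time of the next request

    def wait(self, url):
        """Block until the host of `url` may be hit again; return seconds slept."""
        host = urlsplit(url).hostname or ""
        now = self._clock()
        ready_at = self._next_allowed.get(host, now)
        delay = max(0.0, ready_at - now)
        if delay:
            self._sleep(delay)
        # Schedule the next slot one interval after this request goes out.
        self._next_allowed[host] = max(now, ready_at) + self.min_interval
        return delay
```

        Calling wait(url) before each request keeps any single host at or
        below one request per second, while other hosts are not held up.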
       
          Ronsenshi wrote 4 hours 6 min ago:
          One thing about Google is that many anti-scraping services explicitly
          allow access to Google and maybe a couple of other search engines.
          Everybody else gets to enjoy a Cloudflare CAPTCHA, even when
          crawling at reasonable speeds.
          
          Rules For Thee but Not for Me
       
            ehhthing wrote 2 hours 54 min ago:
            You say this like robots.txt doesn't exist.
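            
            For illustration, a small Python sketch of how a crawler can
            check robots.txt with the standard library. The ROBOTS_TXT
            contents and the allowed() helper are hypothetical, and honoring
            the file is voluntary on the crawler's side:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: welcome one named crawler, turn everyone else away.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `robots_txt` permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

            With this file, allowed("Googlebot", ...) is true and any other
            user agent is refused.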
       
            chii wrote 3 hours 48 min ago:
            > many anti-scraping services explicitly allow access to Google and
            maybe a couple of other search engines.
            
            because Google (and the couple of other search engines) provide
            enough value to offset the crawler's resource consumption.
       
              JasonADrury wrote 39 min ago:
              That's cool, but it's impossible for anyone to ever build a
              competitor that'd replace google without bypassing such services.
       
          BatteryMountain wrote 5 hours 21 min ago:
          Saying the quiet part out loud...Shhhs
       
        brikym wrote 6 hours 39 min ago:
        I'll betcha Google uses a lot of residential proxies itself to
        scrape data and doesn't want competitors doing it.
       
        walletdrainer wrote 6 hours 50 min ago:
        It’s interesting that when Luminati, an Israeli company, does this,
        it’s fine.
        
        When the Chinese do this? Very bad.
       
          VladVladikoff wrote 6 hours 46 min ago:
          They are both bad. You are showing your own bias.
       
            calgoo wrote 2 hours 11 min ago:
            No, he is referencing Google going after the Chinese company, not
            the Israel-based one. That does not mean there is bias with the
            commenter at all, just that the companies operate differently and
            are treated differently. The country of origin is important as
            Israel-based companies are more integrated into the western
            business world, and tend to at least try to show an effort in
            keeping spam and other things off their platforms.
            Now I do agree that they are both bad companies that should not be
            allowed to operate the way they do. I would say the same thing
            about the other 1000 scrapers hitting websites everyday as well
            (including Google).
            
            What they did not comment directly on is how many apps / games
            they might have actually removed from the Play Store with the
            removal of the SDKs, which would be the actual interesting data.
       
              JasonADrury wrote 52 min ago:
              FWIW a couple of years ago I was involved in a court case where
              there was a subpoena sent to Luminati to figure out whether or
              not a specific request had originated from their network.
              Luminati's lawyers replied that they do not keep any logs
              whatsoever as they aren't required to do so under Israeli law.
              
              Hard to imagine any serious anti-abuse efforts by Luminati if
              they don't monitor what their users are doing, but this is
              probably a deliberate effort to avoid potential liability arising
              from knowing what their users are doing.
       
            walletdrainer wrote 5 hours 28 min ago:
            Personally, I don’t think either of them are actually
            meaningfully bad. A bit naughty, maybe?
            
            I do think the disparity in attention is fascinating. These new
            Chinese players have been getting nonstop press while everyone
            ignores the established giant.
       
        ExpertAdvisor01 wrote 8 hours 59 min ago:
        Of course brightdata doesn't get touched.
       
        chatmasta wrote 9 hours 46 min ago:
        Why are they leaving Bright Data (aka Luminati aka Hola VPN)
        untouched? They are doing this exact scheme on an industrial scale.
       
          7thpower wrote 8 hours 21 min ago:
          They have a robust KYC that appears to serve, at least in large part,
          as a way to stay off the shit list of companies with the resources to
          pursue recourse.
          
          Source: went through that process, ended up going a different route.
          The rep was refreshingly transparent about where they get the data and
          why they have the KYC process (aside from regulatory compliance).
          
          Ended up going with a different provider who has been cheaper and
          very reliable, so no complaints.
       
            walletdrainer wrote 6 hours 46 min ago:
            I’ve certainly never been asked to do KYC with Luminati after
            using them for hundreds of terabytes over the years.
            
            It’s not like I’m using some bigco email address or given them
            any other reason to skip KYC either.
       
              ghxst wrote 3 hours 48 min ago:
              They do KYC when you want to unblock certain domains.
       
                walletdrainer wrote 3 hours 37 min ago:
                Also not my experience, even though I’ve had to email them
                for whitelisting.
                
                It might just be because my account is very old?
       
            chatmasta wrote 7 hours 39 min ago:
            Yeah, they make you do a Skype interview (or probably Zoom
            interview nowadays). You could call this KYC or collateral,
            depending on your view of the company. It does limit the
            nefariousness of their clientele but I doubt they do much, or any,
            monitoring of actual traffic after onboarding (not for compliance
            reasons, anyway).
       
        IhateAI wrote 10 hours 42 min ago:
        How do you stop mobile proxies operating through similar nefarious
        business models... CGNAT prevents you from easily identifying the exit
        nodes.
       
          UqWBcuFx6NV4r wrote 8 hours 2 min ago:
          Working with network operators.
       
            Nextgrid wrote 7 hours 11 min ago:
            Network operators have zero reason to care, they get paid by the
            GB for the bandwidth.
       
        progbits wrote 13 hours 23 min ago:
        I'm surprised by the negative takes...
        
        Yes, proxies are good. Ones which you pay for and which are running
        legitimately, with the knowledge (and compensation) of those who run
        them.
        
        Malware in random apps running on your device without your knowledge is
        bad.
       
          vlovich123 wrote 6 hours 15 min ago:
          > Some users may knowingly install this software on their devices,
          lured by the promise of “monetizing” their spare bandwidth.
          
          Sounds like they’re targeting networks even when the users are OK
          with participating, which is precisely what you’re saying is OK.
          
          As for malware enrolling people into the network, it depends if the
          operator is doing it or if the malware is 3rd parties trying to get a
          portion of the cash flow. In the latter case the network would be the
          victim that’s double victimized by Google also attacking them.
       
            xhcuvuvyc wrote 5 hours 23 min ago:
            > These SDKs, which are offered to developers across multiple
            mobile and desktop platforms, surreptitiously enroll user devices
            into the IPIDEA network.
            
            ?
       
            wmf wrote 5 hours 49 min ago:
            Users are OK with acting as proxies because they don't understand
            all the shady stuff their proxy is being used for. Also consumer
            ISPs generally ban this.
       
              JasonADrury wrote 41 min ago:
              Why would the users care either way?
       
                jraph wrote 3 min ago:
                Some people care about ethics, and try to avoid doing bad
                stuff.
       
              iammrpayments wrote 3 hours 21 min ago:
              You could say the same about google’s terms of service.
       
                BrenBarn wrote 2 hours 1 min ago:
                A thousand times yes.
       
              chii wrote 3 hours 50 min ago:
              But then would you make the same arguments for running a tor node
              (presumably, you don't know what shady stuff is there, but you
              know there's shady stuff)?
       
          throwoutway wrote 8 hours 51 min ago:
          > Malware in random apps running on your device without your
          knowledge is bad.
          
          And ones that have all the indicators of compromise of Russia, Iran,
          DPRK, PRC, etc
       
            bigiain wrote 6 hours 0 min ago:
            Am I the only one cynically thinking that "Russia, Iran, DPRK, PRC,
            etc" is the "But think of the chiiildren!!!" excuse for doing this?
            
            And when Google say
            
            "IPIDEA’s proxy infrastructure is a little-known component of the
            digital ecosystem leveraged by a wide array of bad actors."
            
            What they really mean is " ... leveraged by actors indiscriminately
            scraping the web and ignoring copyright - that are not us."
            
            I can't help but feel this is just Google trying to pull the ladder
            up behind them and make it more difficult for other companies to
            collect training data.
       
              shit_game wrote 1 hour 40 min ago:
              >I can't help but feel this is just Google trying to pull the
              ladder up behind them and make it more difficult for other
              companies to collect training data.
              
              I can very easily see this as being Google's reasoning for these
              actions, but let's not pretend that clandestine residential
              proxies aren't used for nefarious things. The vast majority of
              social media networks will ban - or more generally and insidiously
              - shadow ban accounts/IPs that use known proxy IPs. This means
              that they are gating access to their platforms behind residential
              IPs (on top of their other various blackboxes and heuristics like
              fingerprinting). Operators of bot networks thus rely on
              residential proxy services to engage in their work, which ranges
              from mundane things like engagement farming to outright dangerous
              things like political astroturfing, sentiment manipulation, and
              propaganda dissemination.
              
              LLMs and generative image and video models have made the creation
              of biased and convincing content trivial and cheap, if not free.
              The days of "troll farms" are over, and now the greatest expense
              for a bad actor wishing to influence the world with fake
              engagement and biased opinions is their access to platforms,
              which means accounts and internet connections that aren't
              blacklisted or shadow banned. Account maturity and reputation
              farming is also feeling a massive boon due to these tools, but as
              an independent market it also similarly requires internet
              connections that aren't blacklisted or shadow banned. Residential
              proxies are the bottleneck for the vast majority of bad actors.
       
              Craighead wrote 1 hour 51 min ago:
              No, what they're saying is what they said, what you're implying
              reveals a strange bias. Web scraping through residential proxies?
              Please think through your thoughts more. There are much more
              effective and efficient ways to do so. Multiple bad actors, like
              ransomware affiliates, have been caught using residential proxy
              networks. But by all means, don't let facts and cyber threat
              intelligence get in the way.
       
          CodeMage wrote 10 hours 12 min ago:
          Getting rid of malware is good. A private for-profit company
          exercising its power over the Internet, not so much. We should have
          appropriate organizations for this.
       
            vachina wrote 7 hours 58 min ago:
            Proxies are the reason why you get spam in your Google search
            results, spam in your Play Store (by means of fake good reviews),
            basically spam in anything user generated.
            
            It directly affects Google and you, I don’t see why they should
            not do this.
       
              Nextgrid wrote 7 hours 31 min ago:
              Spam in Google search results is due to Google happily taking
              money from the spammers in exchange for promoting their spam, or
              that the spam sites benefit Google indirectly by embedding Google
              Ads/Analytics.
              
              I don't see any spam in Kagi, so clearly there is a way to detect
              and filter it out. Google is simply not doing so because it would
              cut into their profits.
       
                miki123211 wrote 5 hours 16 min ago:
                The reason you don't see spam in Kagi is because nobody is
                targeting Kagi specifically.
                
                They can probably get away with a lot of stupid rules that
                would backfire if anybody tried to cater to them specifically.
       
                  Nextgrid wrote 5 hours 8 min ago:
                  "SEO spammers being more advanced than multi-billion-dollar
                  search conglomerate" is a myth. Spam sites have an obvious
                  objective: display ads, shill affiliate links or sell
                  products. All these have to be visible, since an ad or
                  product you can't see/buy is worthless. It is trivial to
                  train a classifier to detect these.
                  
                  But let's play devil's advocate and say you are right and
                  spammers are successfully outsmarting Google - well, Kagi
                  does use Google results via SerpAPI by their own admission,
                  meaning they too should have those spam results. Yet they
                  somehow manage to filter them out with a fraction of the
                  resources available to Google itself with no negative impact
                  on search quality.
       
            UqWBcuFx6NV4r wrote 8 hours 11 min ago:
            Okay. You get right on that. In the meantime, would you rather they
            did nothing? What do you actually want, in concrete terms?
       
          bdcravens wrote 12 hours 18 min ago:
          Many are "compensated" (in the way of software they didn't pay for),
          so the real question is that of disclosure (in which case many
          software vendors check the box in the most minimal way possible by
          including it as fine print during the install)
       
            happyopossum wrote 11 hours 28 min ago:
            No, the question is not just disclosure. People have their
            bandwidth stolen, and sometimes internet access revoked due to this
            kind of fraud and misuse - disclosure wouldn’t solve that
       
              bigfatkitten wrote 7 hours 14 min ago:
              If they're lucky. Sometimes people have their doors kicked in by
              armed police.
       
              the_fall wrote 11 hours 0 min ago:
              Also, as a website owner, these residential proxies are a real
              pain. Tons and tons of abusive traffic, including people trying
              to exploit vulnerabilities and patently broken crawlers that send
              insane numbers of requests, and no real way to block it.
              
              It's just nasty stuff. Intent matters, and if you're selling a
              service that's used only by the bad guys, you're a bad guy too.
              This is not some dual-use, maybe-we-should-accept-the-risks deal
              that you have with Tor.
       
        htx80nerd wrote 13 hours 24 min ago:
        nice to see in the comments how many people didn't even do a 30 second
        scan of the article before clicking `add comment`
       
        scirob wrote 14 hours 0 min ago:
        so that only google and anthropic are allowed to scrape the web. No one
        else may have workarounds
       
          a456463 wrote 13 hours 11 min ago:
          Exactly. This is just google building a "moat" around their shady
          business.
       
            cvalka wrote 8 hours 13 min ago:
            100%
       
        direwolf20 wrote 14 hours 24 min ago:
        All of this sounds legal, so on what basis did they get them shut down?
       
          SOTGO wrote 14 hours 1 min ago:
          I haven't looked at any court documents, but the WSJ article from
          Wednesday reported that "Last year, Google sued the anonymous
          operators of a network of more than 10 million internet-connected
          televisions, tablets and projectors, saying they had secretly
          pre-installed residential proxy software on them... an Ipidea
          spokeswoman acknowledged in an email that the company and its
          partners had engaged in “relatively aggressive market expansion
          strategies” and “conducted promotional activities in
          inappropriate venues (e.g., hacker forums)...”"
          
          There was also a botnet, Kimwolf, that apparently leveraged an
          exploit to use the residential proxy service, so it may be related to
          Ipidea not shutting them down.
       
            direwolf20 wrote 10 hours 16 min ago:
            Google does much worse in Google–branded devices and apps, like
            the wifi location data harvesting.
       
        londons_explore wrote 14 hours 36 min ago:
        We need more residential proxies, not less.
        
        I've had enough of companies saying "you're connecting from an AWS IP
        address, therefore you aren't allowed in, or must buy enterprise
        licensing". Reddit is an example, which totally blocks all data to
        non-residential IPs.
        
        I want exactly the same content visible no matter who you are or where
        you are connecting from, and a robust network of residential proxies is
        a stepping stone to achieving that.
       
          yuliyp wrote 10 hours 29 min ago:
          The end game of that is no useful content being accessible without
          login, or needing some sort of other proof-of-legitimacy.
       
            Nextgrid wrote 7 hours 19 min ago:
            That's already the case (irrespective of residential proxies)
            because content only serves as bait for someone to hand over
            personal information (during signup/login) and then engage with
            ads.
            
            Proxies actually help with that by facilitating mass account
            registration and scraping of the content without wasting a human's
            time "engaging" with ads.
       
            supertrope wrote 9 hours 55 min ago:
            Amazon.com now only shows you a few reviews. To see the rest you
            must login. Social media websites have long gated the carrots
            behind a login. Anandtech just took their ball and went home by
            going offline.
       
          nine_k wrote 11 hours 32 min ago:
          There's a company that pays you to keep their box connected to your
          residential router. I assume it sells residential proxy services,
          maybe also DDoS services, I don't know. It's aptly named Absurd
          Computing.
       
          crtasm wrote 12 hours 51 min ago:
          I'm reading reddit.com from a Tor node, they also have a .onion
          domain you could use.
       
            Jblx2 wrote 12 hours 1 min ago:
            Anyone know how to create a usable reddit account from the .onion
            domain?
       
              phyzome wrote 11 hours 54 min ago:
              I've tried it, and my account was shadowbanned a few hours after
              I created it. It's very obnoxious.
       
                cluckindan wrote 11 hours 31 min ago:
                Reddit bots shadowban almost everyone who posts before they have
                enough comment karma. Nothing to do with Tor or VPN.
       
          a456463 wrote 13 hours 21 min ago:
          This blog post is from the company that used to promise "don't be
          evil", one that steals water for data centers from villages and
          towns via shady deals, whose whole premise is stealing other
          people's stuff, claiming it as their own, locking them out and
          selling their data. Who made them the arbiter of the internet? No
          one!!!
          
          They just stole this and get on their high horse to tell people how
          to use the internet? You can eff right off, Google.
       
          JDye wrote 13 hours 24 min ago:
          I live in the UK and can't view a large portion of the internet
          without having to submit my ID to _every_ site serving anything
          deemed "not safe for the children". I had a question about a new
          piercing and couldn't get info on it from Reddit because of that. I
          tried using a VPN, and those are blocked too. Luckily, I work at a
          company selling proxies so I've got free proxies whenever I want,
          but I shouldn't _need_ to use them.
          
          I find it funny that companies like Reddit, who make their money
          entirely from content produced by users for free (which is also often
          sourced from other parts of the internet without permission), are so
          against their site being scraped that they have to objectively ruin
          the site for everyone using it. See the API changes and killing off
          of third party apps.
          
          Obviously, it's mostly for advertising purposes, but they love to
          talk about the load scraping puts on their site, even suing AI
          companies and SerpApi for it. If it's truly that bad, just offer a
          free API for the scrapers to use - or even an API that works out just
          slightly cheaper than using proxies...
          
          My ideal internet would look something like that, all content free
          and accessible to everyone.
       
            what wrote 7 hours 45 min ago:
            Have you considered that it’s because a new industry popped up
            that decided it was okay to slurp up the entire internet, repackage
            it, and resell it? Surely that couldn’t be why sites are trying
            to keep non humans out.
       
            201984 wrote 11 hours 22 min ago:
            Fix your government.
       
              JDye wrote 10 hours 52 min ago:
              Thanks lad. Will get right on it.
       
                ThePowerOfFuet wrote 3 hours 35 min ago:
                Scrapping First-past-the-Post is probably a good start.
                
                Good luck!
       
            Aurornis wrote 13 hours 6 min ago:
            > that they have to objectively ruin the site for everyone using
            it. See the API changes and killing off of third party apps.
            
            Third party app users were a very small but vocal minority. The API
            changes didn't drop their traffic at all. In fact, it's only gone
            up since then.
            
            The datacenter IP address blocks aren't just for scrapers, it's an
            anti-bot measure across the board. I don't spend much time on
            Reddit but even the few subreddits I visited were starting to
            become infiltrated by obvious bot accounts doing weird karma
            farming operations.
            
            Even HN routinely gets AI posting bots. It's a common technique to
            generate upvote rings - Make the accounts post comments so they
            look real enough, have the bots randomly upvote things to hide
            activity, and then when someone buys upvotes you have a selection
            of the puppet accounts upvote the targeted story. Having a lot of
            IP addresses and generating fake activity is key to making this
            work, so there's a lot of incentive to do it.
       
              direwolf20 wrote 11 hours 47 min ago:
              Reddit's traffic is almost exclusively propaganda bots.
       
              JDye wrote 12 hours 43 min ago:
              I agree that write-actions should be protected, especially now
              when every other person online is a bot. As for read-actions,
              I'll continue to profit off those being protected too but I
              wouldn't be too bothered if something suddenly changed and all
              content across the internet was a lot easier to access
              programmatically. I think only harm can come from that data
              being restricted to the huge (nefarious) companies that can pay
              for that data or negotiate backroom deals.
       
          tokyobreakfast wrote 13 hours 32 min ago:
          > I've had enough of companies saying "you're connecting from an AWS
          IP address
          
          I run a honeypot and the amount of bot traffic coming from AWS is
          insane. It's like 80% before filtering, and it's 100% illegitimate.
       
            ghxst wrote 3 hours 42 min ago:
            Most of them abuse the ip pool attached to lambda from my
            experience.
       
          Aurornis wrote 13 hours 55 min ago:
          > I want exactly the same content visible no matter who you are or
          where you are connecting from
          
          The reason those IP addresses get blocked is not because of "who" is
          connecting, but "what"
          
          Traffic from datacenter address ranges to sites like Reddit is almost
          entirely bots and scrapers. They can put a tremendous load on your
          site because many will try to run their queries as fast as they can
          with as many IPs as they can get.
          
          Blocking these IP addresses catches a few false positives, but it's
          an easy step to make botting and scraping a little more expensive.
          Residential proxies aren't all that expensive, but now there's a
          little line item bill that comes with their request volume that makes
          them think twice.
          
          > We need more residential proxies, not less
          
          Great, you can always volunteer your home IP address as a start.
          There are services that will pay you a nominal amount for it, even.
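          
          A minimal Python sketch of that kind of range check, using the
          standard ipaddress module. The CIDR blocks here are documentation
          placeholders (RFC 5737), not real provider ranges, and the
          is_datacenter name is invented for the example:

```python
import ipaddress

# Placeholder CIDR blocks (RFC 5737 documentation ranges) standing in for a
# real, regularly refreshed list of cloud-provider address ranges.
DATACENTER_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def is_datacenter(addr, ranges=DATACENTER_RANGES):
    """Return True if `addr` falls inside any of the listed networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in ranges)
```

          A site would run this on the client address and then block,
          challenge, or rate-limit the request; the false positives
          mentioned above are legitimate users inside those ranges.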
       
          BoredPositron wrote 13 hours 56 min ago:
          I still "run" a small ISP with a few thousand residential ips from my
          scraping days. The requirements are laughable and costs were
          negligible in the early 2000s.
       
          ndiddy wrote 14 hours 15 min ago:
          If you look at the article, the network they disrupted pays software
          vendors per-download to sneakily turn their users into residential
          proxy endpoints. I'm sure that at least some of the time the user is
          technically agreeing to some wording buried in the ToS saying they
          consent to this, but it's certainly unethical. I wouldn't want to
          proxy traffic from random people through my home network, that's how
          you get legal threats from media companies or the police called to
          your house.
       
            dataviz1000 wrote 13 hours 23 min ago:
            They provide an SDK for mobile developers. Here is a video of how
            it works. [0] They don't even hide it.
            
            [0]
            
   URI      [1]: https://www.youtube.com/watch?v=1a9HLrwvUO4&t=15s
       
              ndiddy wrote 12 hours 50 min ago:
              Of course they're pitching it like everything's above board, but
              from the article:
              
              > While many residential proxy providers state that they source
              their IP addresses ethically, our analysis shows these claims are
              often incorrect or overstated. Many of the malicious applications
              we analyzed in our investigation did not disclose that they
              enrolled devices into the IPIDEA proxy network. Researchers have
              previously found uncertified and off-brand Android Open Source
              Project devices, such as television set top boxes, with hidden
              residential proxy payloads.
       
                calgoo wrote 2 hours 7 min ago:
                I love how it's the "evil" Open Source project devices, and
                "other app stores" that are the problem, not the 100s of
                spyware-ridden crap apps that are available for download from
                the Play Store. Would be interesting to know how many copies
                of the SDK were found and removed from their own platform.
       
                direwolf20 wrote 11 hours 47 min ago:
                If popup ads that open the play store are ethical, this is
                ethical.
       
            londons_explore wrote 14 hours 8 min ago:
            > that's how you get legal threats from media companies or the
            police called to your house.
            
            Or residential proxies get so widespread that almost every house
            has a proxy in, and it becomes the new way the internet works -
            "for privacy, your data has been routed through someone else's
            connection at random".
       
              Imustaskforhelp wrote 13 hours 39 min ago:
              > Or residential proxies get so widespread that almost every
              house has a proxy in, and it becomes the new way the internet
              works - "for privacy, your data has been routed through someone
              else's connection at random".
              
              Is this a re-invention of tor, maybe I2P?
       
                chii wrote 3 hours 43 min ago:
                > Is this a re-invention of tor
                
                in a way, yes - the weakness of tor is realistically that it
                is not widespread enough. Tor traffic is identifiable and
                blockable due to the relatively small number of exit nodes
                (which also makes it dangerous to run exit nodes, as you
                become "liable").
                
                Ingraining the ideas of tor into regular users' internet usage
                is what would prevent the internet from being controlled and
                blockable by any actor (except perhaps draconian government
                overreach, which, while it can happen, is harder in the west).
       
                rolph wrote 12 hours 6 min ago:
                IP8 address tumbler? to wit, playing the shell game, to
                obstruct direct attribution.
       
          direwolf20 wrote 14 hours 26 min ago:
          You can run one, something like ByteLixir, Traffmonetizer, Honeygain,
          Pawns, there are lots more, just google "share my internet for money"
          
          What will you be proxying? Nobody knows! I haven't had the police at
          my house yet.
          
          Seems a great way to say "fuck you" to companies that block IP
          addresses.
          
          You may see a few more CAPTCHAs. If you have a dynamic IP address,
          not many.
       
            dist-epoch wrote 13 hours 27 min ago:
            How much can you make if you run all of them at the same time?
            
            Doesn't the ISP detect them?
       
              direwolf20 wrote 11 hours 54 min ago:
              like $3 a month
              
              and why would they
       
          xg15 wrote 14 hours 27 min ago:
          Also, never mind the tech companies building their own proxy networks,
          such as Find My or Amazon Sidewalk.
       
            enneff wrote 13 hours 2 min ago:
            How is Find My a proxy network?
       
              direwolf20 wrote 11 hours 45 min ago:
              In the literal sense. Your traffic is proxied through devices
              belonging to unwilling strangers.
       
                enneff wrote 11 hours 37 min ago:
                By “your traffic” you mean device location reports? Or
                something else?
       
                  fc417fc802 wrote 6 hours 39 min ago:
                  Yes. It's "edge routing" that happens to be restricted to a
                  single operator.
       
                  DANmode wrote 8 hours 27 min ago:
                  The data that powers the app tracking your devices, shown on
                  your devices, yes.
                  
                  (What else?)
       
                    enneff wrote 8 hours 6 min ago:
                    I don’t know. I wouldn’t have thought of myself as
                    proxying other people’s traffic by carrying my iPhone
                    around. (For one thing, it’s my own phone that initiates
                    all the activity: it monitors for Apple devices, the
                    devices don’t reach out to my phone.) I can see how you
                    could frame it that way, though. I just thought they might
                    be referring to something else that I didn’t know about.
       
                      MBCook wrote 7 hours 43 min ago:
                      I remain skeptical. I can understand how one might
                      see it that way, but I think it’s stretching the word
                      proxy too far.
                      
                      Devices on Apple’s Find My aren’t broadcasting
                      anything like packets that get forwarded to a destination
                      of their choosing. I would think that would be a
                      necessity to call it “proxying”.
                      
                      They’re just broadcasting basic information about
                      themselves into the void. The phones report back what
                      they’ve picked up.
                      
                      That doesn’t fit the definition to me.
                      
                      I absolutely don’t mind the fact that my phone is doing
                      that. The amount of data is ridiculously minuscule. And
                      it’s sort of a tit for tat thing. Yeah my phone does
                      it, but so does theirs. So just like I may be helping you
                      locate your AirTag, you would be helping me locate mine.
                      Or any other device I own that shows up on Find My.
                      
                      It’s very close to a classic public good, with the
                      only restriction being that you own a relevant device.
       
                        DANmode wrote 6 hours 44 min ago:
                        > aren’t broadcasting anything like packets that get
                        forwarded to a destination of their choosing
                        
                        Protocol insists the data only goes back to owner
                        device or Apple server.
       
            a456463 wrote 13 hours 19 min ago:
            Agreed. They do it with things people paid for, using our wifi
            data to build their "positioning dbs", and you can't block or turn
            that off on your phone without "rooting" your own device.
       
          packetslave wrote 14 hours 29 min ago:
          > Reddit is an example which totally blocks all data to
          non-residential IP's.
          
          No, we don't.
       
            leftouterjoins wrote 11 hours 31 min ago:
            everything on Reddit is so locked down it’s useless. even if you
            do get to post something useful some basement dwelling mod will
            block it for an arcane interpretation of one of the subreddit's 14
            rules.
       
            a456463 wrote 13 hours 18 min ago:
            Have you tried using it logged out on a vpn? It is impossible.
       
            thot_experiment wrote 13 hours 50 min ago:
            I have never interacted with a reddit employee who wasn't actively
            gaslighting me about the platform. Do you even use the site? I
            talked to a PM recently who genuinely thought the phone app was
            something people liked.
       
              MBCook wrote 7 hours 39 min ago:
              There are people who actively like it.
              
              I don’t. But they 100% exist.
       
              direwolf20 wrote 11 hours 45 min ago:
              They probably get paid by how many people believe their nonsense.
       
            dvngnt_ wrote 13 hours 56 min ago:
            there are several times where I've had to disable PIA to access
            reddit's login page
       
            piskov wrote 14 hours 3 min ago:
            Yes you do.
            
            Private VPS for personal VPN in Netherlands (digital ocean), then
            Hungary (some small local DC) — both are blocked from day one.
            
            > You've been blocked by network security. To continue, log in to
            your Reddit account or use your developer token. If you think
            you've been blocked by mistake, file a ticket below and we'll look
            into it.
       
              what wrote 7 hours 36 min ago:
              Sounds like you just need to sign in or use the api?
       
              Imustaskforhelp wrote 13 hours 38 min ago:
              Proton VPN sometimes (mostly?) has this issue too. It's a bit
              hit or miss there iirc but I have definitely seen the last
              message of your comment.
       
            hackeman300 wrote 14 hours 13 min ago:
            Try browsing from any Mullvad vpn. You will be "blocked by network
            security"
       
              yuliyp wrote 10 hours 28 min ago:
              ... if you're logged out. Log in so they don't have to lump you
              in with every scraper you're sharing a subnet with.
       
              edoceo wrote 13 hours 24 min ago:
              I use mullvad regularly & visit reddit from that connection - it
              works. But! You have to sign-in.
       
              gruez wrote 14 hours 11 min ago:
              That's just mullvad's IP pool being banned. The other VPN
              providers I use aren't banned, or at least are banned only
              intermittently, so I can easily switch to another server.
       
            direwolf20 wrote 14 hours 28 min ago:
            Have you tried it? Every new account will be shadowbanned and if
            it's shared you often get blank page 429. None of this was true
            before the API shutdown.
       
              3rodents wrote 13 hours 45 min ago:
              That’s not my experience, using various VPNs, public networks,
              Cloudflare and Apple private relays. A captcha is common when
              logged out but that’s about it, I have not encountered any
              shadow bans. I create a new account each week.
       
              gruez wrote 14 hours 8 min ago:
              >Every new account will be shadowbanned
              
              That's not the same as "blocks all data to non-residential IP's"?
              
              >if it's shared you often get blank page 429. None of this was
              true before the API shutdown.
              
              See my other comment. I agree there's a non-zero amount of VPNs
              that are banned from reddit, but it's also not particularly hard
              to find a VPN that's not banned on reddit.
       
                interloxia wrote 13 hours 58 min ago:
                Probably not hard but my poor little innocent VPS at Hetzner
                that I have had for years is denied and that makes me sad.
       
        samsullivan wrote 14 hours 41 min ago:
        The need for proxies in any legitimate context became obsolete with
        starlink being so widespread. Throw up a few terminals and you have
        about 500-2k cgnat IP addresses to do whatever you like.
       
          JDye wrote 13 hours 37 min ago:
          2k IPs is not enough to do most enterprise scale scraping. Starlink's
          entire ASN doesn't seem to have enough V4 addresses to handle it
          even.
       
            fc417fc802 wrote 6 hours 22 min ago:
            If they're CGNAT then unless Starlink actively provides assistance
            to block them it won't matter.
            
            As someone who wants the internet to maintain as much anarchy as
            possible I think it would be nice to see a large ISP that actively
            rotated its customer IPv6 assignments on a tight schedule.
       
            chatmasta wrote 9 hours 43 min ago:
            The actual secret is to use IPv6 with varied source IPs in the same
            subnet, you get an insane number of IPs and 90% of anti-scraping
            software is not specialized enough to realize that any IP in a /64
            is the same as a single IP in a /32 in IPv4.
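
            A minimal sketch of the point above, from the defender's side:
            key rate limits on the /64 prefix for IPv6 clients instead of
            on the full address. The function name and the sample
            addresses are made up for illustration, and the /64 boundary
            is the commenter's rule of thumb, not a guarantee (some hosts
            share a /64 between unrelated customers).

```python
import ipaddress
from collections import Counter


def rate_limit_key(addr: str) -> str:
    """Return the bucket a client address is counted under.

    IPv6 addresses collapse to their /64 network; IPv4 addresses are
    kept as individual hosts.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 6:
        # strict=False lets us pass a host address and get its network back
        return str(ipaddress.ip_network(f"{ip}/64", strict=False))
    return str(ip)


# Hypothetical request log: two sources inside one /64, one in a
# neighbouring /64, and one IPv4 client.
requests_seen = [
    "2001:db8:1:2::1",
    "2001:db8:1:2:ffff::9",
    "2001:db8:1:3::1",
    "203.0.113.7",
]
hits = Counter(rate_limit_key(a) for a in requests_seen)
```

            With this keying, rotating source addresses inside one /64
            lands every request in the same bucket.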
       
              cferry wrote 3 hours 18 min ago:
              > any IP in a /64 is the same as a single IP in a /32 in IPv4
              
              This is very commonly true but sadly not 100%. I am suffering
              from a shared /64 on which a VPS is, and where other folks have
              sent out spam - so no more SMTP for me.
       
        whartung wrote 14 hours 43 min ago:
        My understanding is that routing through residential IPs is a part of
        the business of some VPN providers. I don't know how above board they
        are on this (as in notifying customers that this may happen, however
        buried in the usage agreement, or even allowing them to opt out).
        
        But, my main point, is that the whole business is "on the up and up" vs
        some dark botnet.
       
          kawsper wrote 10 hours 13 min ago:
          Oxylabs sells proxies for scrapers, I suppose you can use the
          socks-proxy as a VPN, and they claim to use Honeygain.
          
          Honeygain is a platform where people sell their residential internet
          connection and bandwidth to these companies for money.
          
          For comparison Honeygain pays someone 10 cents per GB, and Oxylabs
          sells it for $8/GB.
       
            aussieguy1234 wrote 7 hours 40 min ago:
            That takes buying low and selling high to a whole new level
       
          nielsbot wrote 14 hours 37 min ago:
          FTA
          
          > While operators of residential proxies often extol the privacy and
          freedom of expression benefits of residential proxies, Google Threat
          Intelligence Group’s (GTIG) research shows that these proxies are
          overwhelmingly misused by bad actors
       
            direwolf20 wrote 14 hours 27 min ago:
            Google's definition of a "bad actor" is someone who wants to use
            Google without seeing the ads. Or Kagi. Or an AI other than Gemini.
       
        kotaKat wrote 15 hours 3 min ago:
        I'm actually a little shocked seeing that there was a WebOS variant of
        the residential proxying SDK endpoint. Does that mean there might be a
        bit more unchecked malware lurking behind the scenes in the LG
        ecosystem?
        
        Personally I'm surprised they didn't have a Samsung option.
       
          wincy wrote 14 hours 28 min ago:
          I keep my brand new LG C5 totally disconnected from the internet and
          use my Apple TV for movie watching. I’m not going to trust a
          company like LG to secure their devices.
       
            xnx wrote 14 hours 6 min ago:
            >  trust a company like LG to secure their devices.
            
            They have an interest in securing their devices so they can sell
            proxy service themselves.
       
        xyzzy_plugh wrote 15 hours 11 min ago:
        > These efforts to help keep the broader digital ecosystem safe
        supplement the protections we have to safeguard Android users on
        certified devices. We ensured Google Play Protect, Android’s built-in
        security protection, automatically warns users and removes applications
        known to incorporate IPIDEA SDKs, and blocks any future install
        attempts.
        
        Nice to see Google Play Protect actually serving a purpose for once.
       
          direwolf20 wrote 11 hours 48 min ago:
          Does it also block unwanted traffic from Google apps or does it have
          a particular hatred for companies that interfere with Google's
          business model?
       
            tgsovlerkhgsel wrote 11 hours 33 min ago:
            Play Protect blocks malicious apps, not network traffic, so no, it
            obviously doesn't interfere with Google's apps.
            
            AFAIK it also left SmartTube (an alternative YouTube client) alone
            until the developer got pwned and the app trojanized with this kind
            of SDK, and the clean versions are AFAIK again being left alone. No
            guarantee that it won't change in the future, of course, but so far
            they seem to not be abusing it.
       
              direwolf20 wrote 10 hours 17 min ago:
              Does malicious mean interfering with Google's business model, or
              does it include intrusive advertising?
       
                ThePowerOfFuet wrote 3 hours 40 min ago:
                malicious ≠ intrusive.
       
          trollbridge wrote 13 hours 59 min ago:
          Yeah, it serves the purpose of blocking this kind of proxy traffic
          that isn't in Google's personal best interests.
          
          Only Google is allowed to scrape the web.
       
            miki123211 wrote 5 hours 12 min ago:
            Google does not use residential proxies.
            
            This does nothing against your ability to scrape the web the Google
            way, AKA from your own assigned IP range, obeying robots.txt, and
            with a user agent that explicitly says what you're doing and gives
            website owners a way to opt out.
            
            What Google doesn't want (and I don't think that's a bad thing) is
            competitors scraping the web in bad faith, without disclosing what
            they're doing to site owners and without giving them the ability to
            opt out.
            
            If Google doesn't stop these proxies, unscrupulous parties will
            have a competitive advantage over Google, it's that simple. Then
            Google will have to decide between just giving up (unlikely) or
            becoming unscrupulous themselves.
       
              ryanjshaw wrote 3 hours 30 min ago:
              > This does nothing against your ability to scrape the web the
              Google way
              
              I thought that Google has access to significant portions of the
              internet that non-Google bots won’t have access to?
       
                morkalork wrote 2 hours 28 min ago:
                Their crawler has known IPs that get a white-glove treatment by
                every site with a paywall for example
       
            1vuio0pswjnm7 wrote 7 hours 10 min ago:
            "Only Google is allowed to scrape the web."
            
            If I'm not mistaken, the plaintiffs in the US v Google antitrust
            litigation in the DC Circuit tried to argue that website operators
            are biased toward allowing Google to crawl and against allowing
            other search engines to do the same
            
            The Court rejected this argument because the plaintiffs did not
            present any evidence to support it
            
            For someone who does not follow the web's history, how would one
            produce direct evidence that the bias exists
       
              SkiFire13 wrote 2 hours 30 min ago:
              > For someone who does not follow the web's history, how would
              one produce direct evidence that the bias exists
              
              Take a bunch of websites, fetch their robots.txt file and check
              how many allow GoogleBot but not others?
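
              A rough sketch of that check using Python's standard
              urllib.robotparser. The robots.txt text and the
              "ExampleBot" agent are invented for illustration; a real
              survey would fetch each site's file and would still miss
              blocking done outside robots.txt (IP allowlists, CAPTCHAs).

```python
from urllib.robotparser import RobotFileParser


def allowed_agents(robots_txt, agents, path="/"):
    """Map each user agent to whether robots_txt lets it fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}


def favors_googlebot(robots_txt, other_agent="ExampleBot"):
    """True if Googlebot may crawl "/" but the other agent may not."""
    result = allowed_agents(robots_txt, ["Googlebot", other_agent])
    return result["Googlebot"] and not result[other_agent]


# Invented sample: Googlebot is allowed everywhere, everyone else is not.
SAMPLE = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

# Invented sample: every crawler is treated the same.
NEUTRAL = """\
User-agent: *
Disallow: /private/
"""

biased_count = sum(favors_googlebot(text) for text in [SAMPLE, NEUTRAL])
```

              Running favors_googlebot over a large list of sites and
              counting the True results would give the tally asked about.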
       
            vachina wrote 7 hours 50 min ago:
            This is demonstrably false by the success of many scrapers from AI
            companies.
       
              Nextgrid wrote 7 hours 27 min ago:
              LLMs aren't a good indicator of success here because an LLM
              trained on 80% of the data is just as good as one trained on
              100%, assuming the type/category of data is distributed evenly.
              Proxies help when you do need to get access to 100% of the data
              including data behind social media loginwalls.
       
            viraptor wrote 11 hours 16 min ago:
            Have you got any proof of Google scraping from residential proxies
            users don't know about, rather than from their clearly labelled AS?
            Otherwise you're mixing entirely different things into one claim.
       
              misir wrote 10 hours 36 min ago:
              That's the whole point. Websites that try to block scraping
              attempts will let google scrape without any hurdle because of
              google's ads and search network. This gives google some advantage
              over new players because as a new name brand you are hardly going
              to convince a website to allow scraping even if your product may
              actually be more advantageous to the website (for example assume
              you made a search engine that doesn't suck like google, and
              aggregates links instead of copying content from your website).
              
              Proxies in comparison can allow new players to have some playing
              chance. That said I doubt any legitimate & ethical business would
              use proxies.
       
              idiotsecant wrote 10 hours 41 min ago:
              I don't think parent post is claiming that Google is using other
              people's networks to scrape the web, only that they have a strong
              incentive to keep other players from doing that.
       
                viraptor wrote 10 hours 33 min ago:
                No, there are other scrapers that Google doesn't block or
                interact with. You can even run scraping from GCP. This has
                nothing to do with "only Google is allowed to scrape".
                They even host apps which exist for scraping data, like
                
   URI          [1]: https://play.google.com/store/apps/details?id=com.soci...
       
            a456463 wrote 13 hours 12 min ago:
            Yup exactly. Google must be the only one allowed to scrape the web.
            Google can't have any other competition. Calling it "in the user's
            best interest" is just like their other marketing cons: "play
            integrity for user's security" etc
       
       
   DIR <- back to front page