The study of business sometimes has a dark side.
Among the 2019-2020 recipients of the Wharton Social Impact Initiative Fund for Social Impact Research were Hamsa Bastani, Wharton assistant professor of operations, information and decisions, and Pia Ramchandani, a doctoral student in the same department. The women have been analyzing dark web data related to sex trafficking (human trafficking for the purpose of sexual exploitation) to provide new insights into online sex-trafficking recruiting practices and, more broadly, illicit supply chains on online platforms. Specifically, they are collaborating with TellFinder Alliance, a counter-human trafficking partner network company that provides a suite of tools for human trafficking investigators to search deep-web content and discover hidden connections in online commercial sex content.
Knowledge@Wharton High School spoke with Bastani, who develops novel machine learning algorithms for data-driven decision-making, to find out more details about the long-term research project. In the process, we discovered insights about the business side of the dark web. Here are six key takeaways from Bastani:
The dark web? Often loosely called the deep web, “this is part of the Internet that can only be accessed through specialized software. So, if you were to go on Google Chrome, you wouldn’t necessarily be able to access some of these pages. The Silk Road and some of these other web pages are what we would call the dark web. A lot of the stuff that happens online nowadays, especially if it’s illegal like human trafficking or stuff having to do with drugs, happens on the dark web, which helps the users remain anonymous and untraceable.”
The Wharton connection. “Traditionally, U.S. law enforcement like the Federal Bureau of Investigation had specific local or domain knowledge to follow suspicious perpetrators. Now with all these machine learning tools, TellFinder Alliance and these companies are trying to scale this up by getting a larger picture, by accessing the deep web. They’ve been looking at 80 or 90 websites and about 250 historical websites where a lot of this commercial sex activity happened, and they’ve built machine-learning predictors to try to identify which ads are related to more traditional escort services and which ones are related to actual sex trafficking that we should be concerned about. They brought us in to bring in a more operational perspective to see how these commercial sex web properties are actual supply chains, where they’re sourcing workers, and where they’re making those sales.”
Deep web discoveries. “Our research findings are preliminary. It seems like the ads are more consolidated than we might have thought. Certain phone numbers are associated with tens of thousands of these posts within a nine-month period. That suggests that it’s a very consolidated business. These individuals are not making posts in a distributed way, but it’s a very centralized procedure. We’ve seen some evidence that a lot of recruitment happens in the Midwest and a lot of sales are happening in coastal regions.”
Backpage. “Backpage was one of the major platforms on which these illicit transactions were happening, and the Trump administration took it down. The positive aspect is what the presidential administration was hoping for: that this will disrupt these supply chains and we’ll see a reduction in trafficking and violence against women. The negative effect is that these were known channels. Escorts knew how to navigate this platform and how to find people that might be relatively safe. Now they’re thrown in the dark and they’re more vulnerable and likely to be trafficked. FBI agents have been using this portal to go after certain perpetrators. Now that this portal is gone, that information is gone too. Part of what we’re looking at is how this disruption has helped or hurt violence against women. Since then, a lot of other websites have popped up and this whole industry has become fragmented or chaotic, which is not necessarily a good thing. Our preliminary results suggest that it had a bunch of unintended effects.”
Data, data, data. “We’re in a time period where you can make contributions towards these areas with increasingly better availability of data that are interesting from a more academic perspective. Dark web data gives you a very large scale. For example, in our data set from TellFinder, just in a nine-month period there are over a million ads that are being posted in each of these websites, especially the larger ones. This is really the kind of setting where we want to apply machine learning techniques and use data-driven tools to see if we can help law enforcement… There are a lot of interesting machine learning questions. Usually in machine learning you assume that your data is generated using some independent process and then you try to build a model that predicts some outcomes. But here, the adversaries are generating the data because they’re posting the ads and then simultaneously trying to evade detection and then also reach their clients. That introduces some interesting dynamics into data collection and also how you want to train your models to get good detection rates. There are a lot of rich and interesting problems.”
Attention, future social responsibility managers. “I did my PhD at Stanford University and worked with the Stanford Center for Ocean Solutions. I started a collaboration with Global Fishing Watch where they were using remote sensing data and satellite data to track illicit behavior [with fishing vessels] on the ocean. If you’re a fisherman in the middle of the Pacific Ocean for months at a time, then often that paves the way to labor abuse and illegal fishing labor exploitation. This is an area with lack of transparency where people could get away with a whole bunch of things. Global Fishing Watch is a nonprofit that is specifically for companies to come in, look at the vessels they’re sourcing seafood from, and try to understand if the suppliers they’re purchasing from are trustworthy. So, we’ve looked at how you can improve corporate social responsibility (CSR) at companies using data-driven tools. When we talk to the CSR leads of companies, there’s a lot of resistance to adopting these data-driven tools. You might imagine that these CSR leads often aren’t trained in that sort of data expertise. Eventually, that’s where the future is. We’re not going to be able to have on-the-ground knowledge, especially as our supply chains become more and more global and distributed and there’s all this unauthorized behavior happening. We need to start relying more on these data-driven tools, in combination with domain knowledge where people are experts in their fields. People who are going into CSR or other types of areas need to start having more of an understanding of these tools, like taking some basic machine learning classes.”
What is the intersection here between business and social impact? Why is technology so fundamental, and what does that say about the growing importance of technology in the business world?
Is the dark web a business?
Why does Hamsa Bastani say, "We need to start relying more on data-driven tools, in combination with domain knowledge" to understand how industries operate?