Gathering Business Data? Be Careful, Mom is Watching – A Comment on Data Scraping and the Compulife Case

September 20, 2020

IP Watchdog

When people say that “data is the new oil,” they’re talking about new ways of creating wealth. No matter what business you’re in, success today depends on learning everything you can about your customers and competitors. And there’s so much information sloshing around the internet, every industry—from restaurants to manufacturers to sports teams—is busy extracting insights from “big data” analysis.

But, like drilling for oil, prospecting for data sometimes gets your hands dirty. Recently, a court ruled that a startup company providing life insurance quotes to consumers had created its database – the engine of its business – by taking data from an existing company (Compulife) that had built its own from scratch. The new company didn’t break in and steal the whole thing. Instead, it used robotic software to “scrape” the information from Compulife’s website, by pretending to be a member of the public – actually by pretending to be 43 million members of the public, which is how many rate quotes it was able to extract in only four days.

Having pumped out all that data, they were able to understand the competitor’s system and replicate it. When hauled into court, they shrugged their shoulders and pointed out that the source website was open to the public and they were just gathering what was readily available. Surely, they argued, this couldn’t be trade secret misappropriation because the information wasn’t secret. Not so fast, said the court. Compulife expected that real individual people, not swarms of automated “bots,” would be using their website. The data, it concluded, had been acquired by “improper means.”

Peter Toren, a fellow trade secret practitioner, recently penned a two-part article lamenting this decision. While I very much respect Peter’s views, on this one I firmly believe he was wrong and the court was right.

Whether or not information can be gathered from the internet this way is obviously important. But the issue is not so much about bots and data as it is about your Mom.

Stay with me here, you’ll see what I mean.

From Tents to Bots

Back in 1970, the DuPont company was building a new chemical plant. If a competitor could get into the building site and examine the layout it could understand important aspects of DuPont’s secret processes. So, DuPont erected a fence around the perimeter, with guards and no-trespassing signs. One day the construction manager noticed a plane making multiple passes at an altitude low enough to read the registration number. It turned out that a rival company had hired the pilot to fly over the site and take pictures.

Faced with a lawsuit, the competitor claimed that the construction was in “plain view,” and it had broken no laws. The judge wasn’t impressed. DuPont shouldn’t have to erect a tent over the worksite to prevent what he called “a school-boy’s trick.” This should be no surprise, he explained, because “our ethos has never given moral sanction to piracy” and the “marketplace should not deviate far from our mores.”

Four years later, the U.S. Supreme Court relied on the DuPont case in describing why we enforce trade secret rights. It said that the “maintenance of standards of commercial ethics and the encouragement of invention” are the twin policy pillars of trade secret law, reflecting the “necessity of good faith and honest, fair dealing” in business.

Five years after that, the first version of the Uniform Trade Secrets Act was published, and it defined misappropriation as including acquisition of information by “improper means.” The identical standard applies under the more recent federal law, the Defend Trade Secrets Act. And both of those statutes say that “improper means” “includes theft, bribery, misrepresentation, breach or inducement of a breach of a duty to maintain secrecy, or espionage through electronic or other means.”

In much of the IP world, we love bright lines and sharp edges. For example, to attack a patented invention for lack of novelty, it’s enough to find an academic paper covered with dust in an obscure library. Publication is sudden death. Predictability is highly valued.

Perhaps that’s why some IP lawyers find trade secret laws to be uncomfortable, because they are so, well – flexible. Perhaps this is why my friend Peter misread the Uniform Trade Secrets Act (UTSA) and Defend Trade Secrets Act (DTSA) as restricting “improper means” to a closed set of behaviors, rather than providing a list of examples, which the official comments to the UTSA describe as “a partial listing.” Perhaps that’s why he claimed that the Compulife case was the “first appellate decision in more than 50 years that has relied upon” the DuPont case, when the Supreme Court had leaned on it so firmly back in 1974.

Adding Bricks to the Edifice

Trade secret laws in the U.S. grow from our common law tradition, in which judges wrestling with novel arguments end up adding bricks to the edifice of principles. The foundation of it all, as the Supreme Court said, is the idea that business behavior should be ethical. And as we all know, ethics is highly contextual and situational. Faced with trying to regulate our own personal conduct, we have to be content with suggestive questions, such as “would you be comfortable with this appearing on the front page tomorrow morning?” or – this is my favorite, and what I promised you earlier – “what would your mother think if she were looking over your shoulder right now?”

It’s not just the idea of “improper means” that imposes flexibility on trade secret law. Other key concepts are similarly driven by context. For example, we require that the trade secret holder have exercised “reasonable efforts” to maintain control over information it claims as a trade secret. We disallow protection for information that is “readily ascertainable,” but only when it can be ascertained “by proper means.” And we approve of reverse engineering (taking something apart to discover how it works), except when the thing was acquired unfairly.

None of this should be particularly troubling in the abstract, since we all (or the vast majority of us) want to be ethical actors. But the law keeps us on our toes with its ambiguity. Saving space to condemn creative thieves means that we risk getting in trouble if we go too close to the line, such as it is. This risk is made more complex by changing context. Today, DuPont would be out of luck trying to keep its construction site private, what with Google Earth and other satellite imagery.

Indeed, with rapid advances in technology we regularly introduce not only useful innovations to serve society, but also tools that can be used to capture another’s competitive advantage. The public-facing website resting on a large database gives us a good example of the conundrum. How do we balance the rights of those who want to make useful information available in limited ways against those who claim the right to use what can be found in plain sight?

Maintaining Competitive Advantage

As I’ve already explained, from the legal perspective, I think that the court in the Compulife case got it right, because what the startup did seemed unfair and improper. But how do we translate this modern version of the DuPont case into some guidelines for handling information in an age of ubiquitous data? What can owners of collections of useful data do in order to keep control of their competitive advantage?

First, where the commercial relationship is business to business, rely on carefully drafted contracts to limit the risk that the other party may misuse the information to which they’ve been given access.

Second, in a more public-facing environment, use not only restrictive EULAs (end user license agreements) but also technical measures to make data extraction difficult, at least where this is possible without degrading the usefulness of the product or service being offered.

Third, make it obvious to any user that you don’t want your data misused. Provide warnings that are impossible to miss, like the “no trespassing” sign hanging on the fence. If this ever turns into a legal fight, the court will likely be impressed by evidence that the defendant must have known he was stepping over a line.
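
To make the second suggestion a little more concrete: one common technical measure is a per-visitor rate limit, which tolerates the handful of quotes a real person might request while refusing the kind of volume that produced 43 million quotes in four days. What follows is only a minimal sketch in Python, not a description of anything Compulife actually used. The class name, the limits, and the idea of identifying a visitor by a single ID are all assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most `max_requests` per visitor
    within any span of `window_seconds`."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)  # visitor id -> recent request times

    def allow(self, visitor_id):
        now = self.clock()
        hits = self._hits[visitor_id]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: refuse, delay, or challenge
        hits.append(now)
        return True


# Illustrative numbers only: 20 quote requests per visitor per hour.
limiter = SlidingWindowLimiter(max_requests=20, window_seconds=3600)
if not limiter.allow("visitor-123"):
    print("Too many quote requests; please try again later.")
```

A limit like this is easy to evade by spreading requests across many addresses, so it is at best one layer among several. It does, however, leave a record that the owner took steps to keep bulk extraction out.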

And what about those of you who are looking for creative ways to gather data? Whatever you’re thinking of doing, know that Mom is watching.
