Cheaper Software
“People get bored, people get headaches. Computers don’t,” said Bill Herr, a lawyer who used to work for a chemical company.
“The economic impact will be huge,” said Tom Mitchell, chairman
of the machine learning department at Carnegie Mellon University in
Pittsburgh. “We’re at the beginning of a 10-year period where
we’re going to transition from computers that can’t understand
language to a point where computers can understand quite a bit about
language.”
Nowhere are these advances clearer than in the legal world.
E-discovery technologies generally fall into two broad categories that can
be described as “linguistic” and “sociological.”
The most basic linguistic approach uses specific search words to find and
sort relevant documents. More advanced programs filter documents through a
large web of word and phrase definitions. A user who types
“dog” will also find documents that mention “man’s
best friend” and even the notion of a “walk.”
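
As a rough illustration of the difference between the two linguistic
approaches just described, the sketch below contrasts a literal keyword match
with a query expanded through a small table of related phrases. The table
RELATED_PHRASES, both function names and the sample documents are
hypothetical, invented here for illustration. Commercial e-discovery products
presumably rely on far larger vocabularies and more careful matching than the
simple substring test used here.

```python
# Hypothetical, hand-built table of related phrases (an assumption made for
# illustration; a real system would draw on a much larger resource).
RELATED_PHRASES = {
    "dog": ["dog", "man's best friend", "walk"],
}

# Invented sample documents.
DOCUMENTS = [
    "The dog barked all night.",
    "He took man's best friend to the park.",
    "Time for a walk before the meeting.",
    "Quarterly earnings were flat.",
]


def keyword_search(query, documents):
    """Return indexes of documents that contain the literal query text."""
    needle = query.lower()
    return [i for i, text in enumerate(documents) if needle in text.lower()]


def expanded_search(query, documents):
    """Return indexes of documents that contain the query or a related phrase."""
    # Fall back to the query itself when the table has no entry for it.
    phrases = RELATED_PHRASES.get(query.lower(), [query.lower()])
    hits = []
    for i, text in enumerate(documents):
        lowered = text.lower()
        if any(phrase in lowered for phrase in phrases):
            hits.append(i)
    return hits


if __name__ == "__main__":
    print(keyword_search("dog", DOCUMENTS))
    print(expanded_search("dog", DOCUMENTS))
```

With these sample documents, the literal search for "dog" matches only the
first document, while the expanded search also picks up the two that never
use the word.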
The sociological approach adds an inferential layer of analysis, mimicking
the deductive powers of a human Sherlock Holmes. Engineers and linguists at
Cataphora, an information-sifting company based in Silicon Valley, have
their software mine documents for the activities and interactions of people
— who did what when, and who talks to whom. The software seeks to
visualize chains of events. It identifies discussions that might have taken
place across e-mail, instant messages and telephone calls.
Then the computer pounces, so to speak, capturing “digital
anomalies” that white-collar criminals often create in trying to hide
their activities.
For example, it finds “call me” moments — those incidents
when an employee decides to hide a particular action by having a private
conversation. This usually involves switching media, perhaps from an e-mail
conversation to instant messaging, telephone or even a face-to-face
encounter.
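
One simple way to picture a "call me" detector is as a scan for the moment
when two people who have been talking in one medium abruptly continue in
another. The sketch below is only a guess at the general idea, not a
description of Cataphora's software: the event fields, the function name
find_media_switches and the one-hour window are all assumptions, and the
sample events are invented. Consistent with the article, it looks only at who
contacted whom, when and how, and never at the words used.

```python
from datetime import datetime, timedelta


def find_media_switches(events, window=timedelta(hours=1)):
    """Find pairs of consecutive events between the same two people
    that change medium within `window` (an assumed threshold)."""
    last_seen = {}  # maps a pair of people to their most recent event
    switches = []
    for event in sorted(events, key=lambda e: e["time"]):
        # Direction does not matter, so treat the two parties as a set.
        pair = frozenset((event["sender"], event["recipient"]))
        previous = last_seen.get(pair)
        if (
            previous is not None
            and previous["medium"] != event["medium"]
            and event["time"] - previous["time"] <= window
        ):
            switches.append((previous, event))
        last_seen[pair] = event
    return switches


# Invented sample data: an e-mail exchange that moves to the telephone.
EVENTS = [
    {"time": datetime(2011, 3, 1, 9, 0), "sender": "alice",
     "recipient": "bob", "medium": "email"},
    {"time": datetime(2011, 3, 1, 9, 5), "sender": "bob",
     "recipient": "alice", "medium": "email"},
    {"time": datetime(2011, 3, 1, 9, 7), "sender": "alice",
     "recipient": "bob", "medium": "phone"},
    {"time": datetime(2011, 3, 1, 15, 0), "sender": "alice",
     "recipient": "carol", "medium": "email"},
]

if __name__ == "__main__":
    for before, after in find_media_switches(EVENTS):
        print(before["medium"], "->", after["medium"], "at", after["time"])
```

A switch of medium is ordinary in itself, so a flag of this kind would at
most mark a conversation for a human reviewer to look at.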
“It doesn’t use keywords at all,” said Elizabeth
Charnock, Cataphora’s founder. “But it’s a means of
showing who leaked information, who’s influential in the organization
or when a sensitive document like an S.E.C. filing is being edited an
unusual number of times, or an unusual number of ways, by an unusual type
or number of people.”
The Cataphora software can also recognize the sentiment in an e-mail
message — whether a person is positive or negative, or what the
company calls “loud talking” — unusual emphasis that
might give hints that a document is about a stressful situation. The
software can also detect subtle changes in the style of an e-mail
communication.
A shift in an author’s e-mail style, from breezy to unusually formal,
can raise a red flag about illegal activity.
“You tend to split a lot fewer infinitives when you think the F.B.I.
might be reading your mail,” said Steve Roberts, Cataphora’s
chief technology officer.
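
The "loud talking" signal mentioned above can be approximated, very crudely,
by counting capitalized words and exclamation marks. The scoring rule and the
name emphasis_score in the sketch below are assumptions made for illustration.
The article does not say how Cataphora measures emphasis, and a real system
would likely compare each message against the author's usual style rather
than use a fixed formula.

```python
import re


def emphasis_score(text):
    """Return (all-capital words + exclamation marks) / total words.

    A crude stand-in for unusual emphasis; returns 0.0 for empty text.
    """
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    # Skip one-letter words so that "I" and "A" do not count as shouting.
    shouted = sum(1 for word in words if len(word) > 1 and word.isupper())
    exclamations = text.count("!")
    return (shouted + exclamations) / len(words)


if __name__ == "__main__":
    print(emphasis_score("Please send the report when you can."))
    print(emphasis_score("Send it NOW!"))
```

A message that scores well above its author's typical level would be a
candidate for closer reading, not evidence of anything on its own.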
Another e-discovery company in Silicon Valley, Clearwell, has developed
software that analyzes documents to find concepts rather than specific
keywords, shortening the time required to locate relevant material in
litigation.
Last year, Clearwell software was used by the law firm DLA Piper to search
through more than a half-million documents under a court-imposed deadline of
one week. Clearwell’s software analyzed and sorted 570,000 documents
(each document can be many pages) in two days. The law firm used just one
more day to identify 3,070 documents that were relevant to the
court-ordered discovery motion.
Clearwell’s software uses language analysis and a visual way of
representing general concepts found in documents to make it possible for a
single lawyer to do work that might have once required hundreds.
“The catch here is information overload,” said Aaref A. Hilaly,
Clearwell’s chief executive. “How do you zoom in to just the
specific set of documents or facts that are relevant to the specific
question? It’s not about search; it’s about sifting, and
that’s what e-discovery software enables.”
For Neil Fraser, a lawyer at Milberg, a law firm based in New York, the
Cataphora software provides a way to better understand the internal
workings of corporations he sues, particularly when the real decision
makers may be hidden from view.
He says the software allows him to find the ex-Pfc. Wintergreens in an
organization — a reference to a lowly character in the novel
“Catch-22” who wielded great power because he distributed mail
to generals and was able to withhold it or dispatch it as he saw fit.