Peruse Reads Your Spreadsheets So You Don’t Have To

The time-consuming tedium of file search is San Francisco-based startup Peruse’s jumping-off point. It’s aiming to simplify locating files by changing how people search so they don’t have to remember exactly where to look or exactly what the file was called.

Peruse’s fix for this age-old tech problem is a natural-language question-and-answer interface, rather than a narrow keyword search, so that contextual details such as who sent a file or when you last edited it can simplify and speed up your search. It’s launching its SaaS enterprise search product today onstage at TechCrunch Disrupt NY.

Founder Luke Gotszling describes this as “more of a human approach” to search. So you could, for instance, ask Peruse to locate “PowerPoints I edited last week” or “PDFs that Matthew Panzarino sent me” — saving a whole lot of guesswork about what the file was named and likely also reducing the amount of manual combing of keyword file search results you have to do.
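To make the idea concrete, here is a minimal sketch of how a request like the two above might be turned into a structured file filter. It is not Peruse's implementation, which the company has not described. The `FILE_TYPES` table, the `FileQuery` fields and the `parse_query` function are all hypothetical names invented for illustration, and the regular expressions only cover the phrasings quoted in this article.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical mapping from a word in the request to file extensions.
FILE_TYPES = {
    "powerpoint": (".ppt", ".pptx"),
    "pdf": (".pdf",),
    "spreadsheet": (".xls", ".xlsx", ".csv"),
}


@dataclass
class FileQuery:
    """Structured filter extracted from a plain-English request (assumed shape)."""
    extensions: Tuple[str, ...] = ()
    action: Optional[str] = None   # e.g. "edited"
    period: Optional[str] = None   # e.g. "last week"
    sender: Optional[str] = None   # e.g. "Matthew Panzarino"


def parse_query(text: str) -> FileQuery:
    """Pull file type, action, time period and sender out of a request."""
    query = FileQuery()
    lowered = text.lower()

    # File type: match the singular or plural form of a known word.
    for word, extensions in FILE_TYPES.items():
        if re.search(rf"\b{word}s?\b", lowered):
            query.extensions = extensions
            break

    # Action the user performed, as in "I edited".
    match = re.search(r"\bi (edited|created|opened)\b", lowered)
    if match:
        query.action = match.group(1)

    # Relative time period, as in "last week".
    match = re.search(r"\b(last|this) (week|month|year)\b", lowered)
    if match:
        query.period = match.group(0)

    # Sender, as in "that <name> sent me"; keep the original capitalization.
    match = re.search(r"\bthat (.+?) sent me\b", text, flags=re.IGNORECASE)
    if match:
        query.sender = match.group(1)

    return query


edited = parse_query("PowerPoints I edited last week")
sent = parse_query("PDFs that Matthew Panzarino sent me")
```

In this sketch, `edited` carries the PowerPoint extensions plus the action and period, while `sent` carries the PDF extension and the sender's name. A real system would then apply those fields as filters against file metadata from the storage service.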

Peruse plugs into existing cloud storage services, rather than requiring you to install (yet) another piece of software. You can type your search into Peruse’s interface or speak it if you prefer, whichever is quicker and easier.

“File search is very unsatisfying,” Gotszling tells TechCrunch. “I’ll type in a few keywords and then whatever those keywords happen to match I will just get a list of files. That’s basically where things were in 1995 and that’s also basically where things are nowadays.

“For me that’s really unsatisfying. And we can go beyond that: as people that’s not how we think about information. I don’t think about these 20 keywords that will match this file exactly. We think about things like ‘oh this PowerPoint I edited last week.’”

“Anyone who has worked in a medium-sized company or above typically finds out how difficult it is to locate information,” he adds. “People are used to spending hours a month just looking for information.”

Peruse’s natural language file search works for business documents of any file type, although the NLP tech currently works only for English. The service is also initially limited to documents stored in Box or Dropbox, but the company intends to integrate with more cloud storage services.

That’s the first plank of Peruse. The second core feature it has in the works takes these NL search capabilities further — or rather, deeper — by allowing users to search for specific facts from inside a document, rather than having the tech simply point to the file itself.

The latter “deep insights” feature has not yet launched, but Peruse is now offering a waiting list for interested sign-ups. Initially the feature will work only with spreadsheets. Searches run in real time, but Peruse indexes the spreadsheets themselves in the background, which is why it’s operating a waiting list.

Who is this for? Gotszling gives the example of a benefits company that deals with a lot of small businesses, and needs to find the annual hourly labor cost from a restaurant’s spreadsheet. Instead of locating the file, loading it and using a control-F approach to try to locate the “annual hourly labor cost” figure, Peruse’s natural language tech parses the indexed spreadsheet data itself to return an answer to that specific query without the user having to dive into the file at all.

So, in other words, this is a machine that reads spreadsheets so you don’t have to. (The system does also specify where in the spreadsheet the data is located, so a human can go in and check.)
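As a rough illustration of answering a question from a spreadsheet's labels and reporting where the answer lives, the sketch below scores each numeric cell by how many words of the question appear in its row label and column header. This simple word-overlap technique stands in for Peruse's natural language tech and is not the company's method. The `answer` function, the layout assumption (labels in the first row and first column) and the sample figures are all made up for the example.

```python
import re


def tokens(text):
    """Lowercase a label or question and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", str(text).lower()))


def column_letter(index):
    """Convert a zero-based column index to spreadsheet letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def answer(grid, question):
    """Return the best-matching numeric cell and its location, or None.

    Assumes row 0 holds column headers and column 0 holds row labels.
    """
    question_words = tokens(question)
    headers = grid[0]
    best = None  # (score, value, cell reference)
    for r, row in enumerate(grid[1:], start=1):
        row_score = len(question_words & tokens(row[0]))
        for c in range(1, len(row)):
            value = row[c]
            if not isinstance(value, (int, float)):
                continue  # only numeric cells can be answers in this sketch
            score = row_score + len(question_words & tokens(headers[c]))
            if score > 0 and (best is None or score > best[0]):
                best = (score, value, f"{column_letter(c)}{r + 1}")
    if best is None:
        return None
    return {"value": best[1], "cell": best[2]}


# Invented sample data for illustration only.
sheet = [
    ["Item", "Monday", "Tuesday"],
    ["Bar cocktail sales", 410.0, 385.5],
    ["Bar beer sales", 220.0, 198.0],
    ["Kitchen food sales", 900.0, 870.0],
]

result = answer(sheet, "What were the cocktail sales at the bar on Tuesday?")
```

Here `result` holds both the figure and the cell reference, so a person can open the file and check it. A question that shares no words with any label returns `None`, since the sketch relies entirely on the text labels around the numbers.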

“You could say something like ‘what were the cocktail sales at the bar on Tuesday,’” says Gotszling, detailing the kind of complex spreadsheet search query Peruse can tackle. “No one is going to do this with control F because they’re not going to expect that to be a label somewhere. In fact if you were looking for this information and using the traditional search methodology it may be difficult for you to figure out do I search for ‘cocktail sales,’ do I search for ‘bar,’ do I search for ‘Tuesday,’ do I search for ‘sales’?

“In this case it’s so multifaceted that it may be hard for you to even know where to begin. And then if you pick something that’s too generic, then the next thing you know is you have 100 results and you’re paging through a 22-page spreadsheet for this information.”

There are obviously limits to this spreadsheet reader software. A spreadsheet comprising only numbers, without any text labels sign-posting the data, isn’t going to be intelligently parseable to anyone — not even a machine. And spreadsheets with limited labeling may also trouble it.

Data represented in two different formats within a spreadsheet can also cause confusion. Gotszling notes the system can serve up multiple possible answers to a query, requiring the human to step back in and apply their intelligence to figure out that one answer is actually a proportion and the other an absolute value, say.

Above all, the data you want to locate also has to be contained within a spreadsheet in the first place, although Gotszling says Peruse is considering expanding this feature to support regular documents and PDFs too. Other areas it’s also contemplating include team chat logs, emails and calendar entries. Nothing is confirmed as yet though.

Also on the slate for Peruse’s future: an Apple Watch app. “I think something like this can be particularly well suited to that form factor,” he says. “Especially when you’re not looking for a whole file but when you’re looking for a piece of information inside of that file… that can be easily presented in a sentence then I think this would work really well.”

“Right now it’s just going to work with spreadsheets,” he adds. “We built software that tries to understand this type of content in the same way that a person would understand it… To try to imitate the visual understanding because one thing we don’t want people to have to do is to reformat stuff so that it works for this.

“Perhaps ironically the more machine readable it is, the less understandable it is by our system because we built it so that it parses this information in the way that a human would.”

Gotszling says Peruse hasn’t been building the software with a particular industry in mind but argues it’s well suited to analysis-focused businesses, such as hedge funds and HR and benefits companies, i.e. those which are dealing with large amounts of information and often need to pick out a particular piece of data from large haystacks of info. (The NSA springs to mind on that — but clearly has its own specialized data-mining software.)

“I’ve had one discussion with a person who said that just scrolling the spreadsheet is something that takes them minutes of time. So generally people that deal with huge volumes of information, it has resonated more with them,” says Gotszling.

In terms of the competitive landscape beyond plain old keyword search, he points to Microsoft’s Power BI product, specifically for “spreadsheets intelligence,” although he suggests that product is more focused on visualization than data retrieval. He also casts IBM’s Watson as another potential rival, but adds: “I haven’t really seen them tackle business documents with that technology but I could certainly see that being competitive.”

Gotszling was employee No. 1 at About.me, where he worked on infrastructure, including its structured search feature — so he’s evidently feeding some of that expertise into his new venture. He’s been bootstrapping and building the Peruse prototype since late October 2014, ramping up on the hiring front with core team members in the past month.

from TechCrunch http://feedproxy.google.com/~r/Techcrunch/~3/653mRS8FWmo/
via IFTTT
