Dorai Thodla

Information Tracking, Intelligence

InfoMinder

Posted on January 4, 2019 (updated March 31, 2019) by Dorai Thodla

What is it?

It is a service for tracking web pages for changes. You can sign up at the InfoMinder site for a free 30 day trial.

How Do People Use it?

Here are several ways people use InfoMinder.

Lead Generation – By tracking sites that publish RFPs (requests for proposals) for equipment, services and grants, and filtering them using keywords, people obtain leads. Other lead sources include job pages, job sites, question sites etc.
Competitive Intelligence – By tracking competitors’ pages, people increase their awareness of new product introductions, new partnerships, customer wins and news items. It is much more efficient to do this using a service like InfoMinder than manually checking pages frequently.
Marketing – Several PR companies use InfoMinder to track news propagation, coverage and news clipping services.
Sales – Sales people use InfoMinder for keeping informed about their customers. An event in a customer company may become a topic of conversation.
Tracking Industry – Many companies use InfoMinder to keep track of developments in their own industry. They do this by tracking pages on portals specific to an industry.
Aggregation of Information – Many government departments use InfoMinder to obtain and aggregate information specific to the departments.
Legal Research – Several legal professionals use InfoMinder to track trademark and copyright violations and legal events.
Compliance Information – Several industries use InfoMinder to track sites that provide information about compliance.
Corporate Research – Many corporate librarians use InfoMinder for product research and product technology information.
Internet Research – Since InfoMinder can track your bookmarks easily, anything you were interested enough to bookmark can be tracked easily.
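To make the idea of "tracking a page and filtering by keywords" concrete, here is a minimal sketch in Python. It is not how InfoMinder works internally; the function names and the sample text are made up for illustration, and a real tracker would fetch pages over the network and compare them in a much smarter way.

```python
import hashlib


def fingerprint(page_text: str) -> str:
    """Digest of the page text, ignoring differences in whitespace."""
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(old_text: str, new_text: str) -> bool:
    """True when the two snapshots differ after whitespace normalization."""
    return fingerprint(old_text) != fingerprint(new_text)


def matching_keywords(page_text: str, keywords: list[str]) -> list[str]:
    """Keywords that occur in the page (case-insensitive substring match)."""
    lowered = page_text.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


# Hypothetical snapshots of the same page taken on two different days.
old_snapshot = "Open RFPs: none at this time."
new_snapshot = "Open RFPs: request for proposals for lab equipment, due June 1."

if has_changed(old_snapshot, new_snapshot):
    print(matching_keywords(new_snapshot, ["equipment", "grants"]))
```

An alert would only be worth sending when the page has changed and at least one keyword matches.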

At iMorph, we have tools for Finding, Tracking and Mining Information. If you have interest in this space or have questions, feel free to leave a comment on this page. We will be happy to answer any questions.

Information Tracking, Intelligence

ScienceLog: An Eye On The Universe

Posted on January 4, 2019 (updated March 19, 2019) by Dorai Thodla

From AMON: An Eye on the Universe

AMON stands for Astrophysical Multimessenger Observatory Network. Its mission is to form a network of high-energy observatories across the globe that will search for previously unseen astrophysical signals and send alerts to more traditional telescopes in order to corroborate the possible celestial events.
Until the early 20th Century, astronomers relied almost exclusively on visible light to view the sky. Their telescopes, though steadily increasing in power, were no different in this respect from the ones used by Galileo in 1610. Today we see much more of the universe by observing light from all across the electromagnetic spectrum. Gamma-ray, x-ray, infrared, and radio astronomy have revolutionized astronomical observation, as has the advent of space-based telescopes to complement those on the ground.
The past 50 years have seen tremendous progress in the sensitivity of instruments to detect cosmic rays (high-energy charged particles from outer space, such as protons and charged nuclei). Particle accelerators have enabled physicists to create, detect, and analyze other sub-atomic particles, such as neutrinos. These alternative messengers (particles that survive across vast distances in space) presented whole new avenues of exploration.

Computing infrastructure

The Research Computing and Cyberinfrastructure (RCC) unit of Information Technology Services enables scholars to do large-scale computations through linked services, including hardware, software, and personnel.
The High Performance Computing (HPC) system within RCC is a shared resource among dozens of researchers in a host of departmental and interdisciplinary units at Penn State that meets the dual data challenges presented by the AMON project. First, there is the need to continuously receive data from the triggering instruments. This requires computing systems with robust and consistently high “up-time.” The HPC has sub-systems rated at Tier III, with 99.999 percent up-time (less than five minutes of downtime annually).
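The downtime figure in parentheses can be checked with quick arithmetic. A small sketch, assuming a 365-day year (the function name is mine, not from the article):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def annual_downtime_minutes(uptime_percent: float) -> float:
    """Minutes per year a system may be down at the given up-time percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


print(round(annual_downtime_minutes(99.999), 2))
```

At 99.999 percent this works out to a little over five minutes a year (about 5.26), so "about five minutes" is probably the closer reading of the quoted figure.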

Probabilistic Databases

“We are experimenting with a ‘probabilistic’ database that can collect disparate data, say, on neutrinos and gamma rays, and quickly determine the probability that both have come from the same source. This is cutting edge database work.”

Meta:

It is fascinating to read about how computers and new scientific software and databases help advanced research. Part of my random reading. Some of the most interesting Science Research articles come from tracking the NSF (National Science Foundation) site.

One of the best ways to look at advances in Science and Technology is to track funded research.

Information Tracking, Intelligence

After That, What?

Posted on January 4, 2019 (updated March 19, 2019) by Dorai Thodla

Several years ago I was describing our product, InfoMinder to a friend. I told him how cool it is, to have something that tracks websites, and alerts you when they change. He patiently listened to my pitch and asked a simple question:

After that, what?

I did not understand it, at first. I asked him to elaborate. He asked “what do your customers do after they get these alerts?”. Frankly, I had not thought of that. I knew what I would do, but I had no idea what our customers did. Fortunately, it was easy to fix that problem. We just asked a bunch of them.

The moral of the story is that you need to think about what people do after they consume your product. You may be missing some opportunities for follow on products or providing a better solution.

Let me paint a few sample scenarios, for you.

You set up Google alerts. What do you do after you receive alert email?
You search for a company or topic and get a bunch of results from Google. What do you do, after that?
You locate an address using Google maps on your computer. What do you do next?
A friend texts you an address for a meeting. Now you have a smart phone with a good online map support. What do you do next?

The answer most of the time is – “it depends”.

I think there are some opportunities for some nifty tools to help the ‘after that, what’ problem. As we switch more and more to mobile devices to help us cope with our life, more such tools will be useful.

“After that, what?” is a good question to ask yourself, if you are looking for mobile product ideas. If you don’t want to do that, you can always ask me. I have a bunch of those problems.

Information Tracking, Intelligence

Freemium Model – Our Experience

Posted on January 4, 2019 (updated March 19, 2019) by Dorai Thodla

Here are some notes from the Freemium session at Google I/O. I wanted to annotate them with our experience with InfoMinder.

Why would you offer something for free?
To give the user time to learn about the product
User can provide distribution benefit (inviting other users to the product)
Network benefit – adding value in some way

The scale of conversions is in a 1-10% range.

In our case, we sent an email message to about 20 friends and had a network of about 3000 users in 6 months. We provided the product free for a year. Then we started charging for a professional version but kept a basic version free for some time. Our conversion rate was initially slightly more than 10%. We had a much bigger payoff when two companies OEMed our product.

In the past no business app, except Intuit, has penetrated the 10 employee or less market.
The model is going after new segments and new opportunities with a scale that can make it work.
Direct vs. indirect revenue models.
Feedburner – free version and $5/month version with some extra features.

Our basic version is still less than $2.50/month.

Product segmentation of free versus paid side
Viral/growth oriented things should be on the free side
Things that engage users in deeper behavior should be on the paid side
Can be difficult to draw the line within a company of what should be free vs. paid

The pricing was a bit difficult for us. We initially priced it at $14.99 per year. Gradually moved it up to $30/year. Our current products range from $30-$5000 per year based on the configuration. We have a lot more customers in the mid range $10/month than in others.

Start from the beginning with the model you hope to put in place – put your business model in beta at the same time you put your product into beta

This is easier said than done for two reasons: your product evolves, and so does your knowledge of your customers. It took us a couple of years to find the sweet spots.

If the free product is so good that there’s no reason to pay, then an alternative is to limit capacity. Where do you draw the line? On the selection of features or on the capacity? – find your fanatical users and have them help you segment where the paywall should be

This is exactly what we did. Our trial product is almost as good as our full product. We charge by capacity. Others like WordPress do the same.

Establishing the value of your product is probably more important than establishing a pricing model right away.

This is very true. We continue to get business and keep customers even though there are some free products that provide some of the functionality. One of our customers told us that after trying out a few competitive products, they decided to stay with us due to a combination of quality and support.

Your instinct about what your customer will/will not pay for is likely wrong. Be flexible early in your business to be able to listen to your user feedback. Have the right premise – if you need 100M people to use your product and it’s not viral, it probably won’t work, for example.

Question: Freemium seems to be when dealing with the direct consumer. What is the balance between different models?

In mobile gaming space — collecting pennies per user over a lot of users can make a big difference. A mix of revenue types (direct/indirect) can work
Conversion rates will probably be between 2% and 5%, realistically. Most people won’t pay you for features (you may think they’re more valuable than the users do). Make people feel like they’re getting more value, not just additional stuff on top of what they already use.

Go ahead and read the wave. It has lots of information. Please feel free to ask us questions. You can email me or contact me directly (my co-ordinates are in the About link).

Information Tracking, Intelligence

Visualizing User Categories

Posted on January 4, 2019 (updated March 19, 2019) by Dorai Thodla

This is how I did it in a few steps:

1. Extracted the titles from the list of registered users’ profiles

2. Edited the titles to remove certain keywords (for example Director, Manager, VP)

3. Copied the text and pasted it in Wordle tool using the Create option

4. Randomized the display till I found the one I am happy with

5. Used Snagit to capture the tag cloud and save it as an image

It is kind of cool. I plan to use it in the new version of iMorph website. Here are a few more things I would have loved to have:

1. I had counts of titles (a weight) but there is no way to pass Wordle that info

2. I would have liked each word to be a hyperlink (to the list of titles).
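For what it is worth, the extract-and-edit steps and the missing weights can be approximated in a few lines of Python. This is only a sketch: the sample titles are invented, and the stop-word list contains the three example keywords from step 2 plus "of", which is my own addition.

```python
import re
from collections import Counter

# Words to drop before counting. "director", "manager" and "vp" are the
# examples from step 2; "of" is an extra assumption for this sketch.
STOP_WORDS = {"director", "manager", "vp", "of"}


def title_word_counts(titles, stop_words=STOP_WORDS):
    """Count how often each remaining word appears across all titles."""
    counts = Counter()
    for title in titles:
        for word in re.findall(r"[a-z]+", title.lower()):
            if word not in stop_words:
                counts[word] += 1
    return counts


# Invented sample of job titles from user profiles.
titles = [
    "Director of Marketing",
    "Marketing Manager",
    "VP Research",
    "Research Librarian",
]

for word, weight in title_word_counts(titles).most_common():
    print(word, weight)
```

The word/weight pairs are exactly what a tag cloud tool would need if it accepted weighted input.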

I am exploring a few more tag cloud generation tools to see whether we can mash up some data with the clouds.

Why I did this:

I was doing this as a marketing exercise to try and find “the ideal user”. When you build generic tools like InfoMinder, you tend to have a wide variety of users. Still, it is interesting to find patterns in your user base, which provide a sense of direction for product enhancements as well as new products. Most of all, they provide clues about who your potential channel partners are. Typically they are the same ones who sell to your users.

Information Tracking, Intelligence

Discovering Relevant Sources of Information

Posted on January 4, 2019 (updated March 19, 2019) by Dorai Thodla

Discovering relevant sources of information is a recursive process. Let me explain.

Let us say that you want to track clean tech. The easiest way to find a list of sources is to type “cleantech” in your favorite search engine and look at the top 20 distinctly different sources.

But that is just the first step. When you look at these you will find several interesting patterns. You may find portals about cleantech. You may find a directory of resources. You may find some popular bloggers or authors. The list goes on. Based on what you see, you can spawn more searches like these:

cleantech directory
cleantech products
cleantech vendors
cleantech lists

You can also find several related terms (or even ontologies) and include them in searches. An example would be “clean tech” OR “green tech” OR “renewable energy” etc.
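Spawning these searches is mechanical enough to script. A minimal sketch, assuming the search engine accepts quoted phrases and the OR operator (the helper names are made up):

```python
def expand(topic, suffixes):
    """Combine a topic with each suffix, e.g. 'cleantech directory'."""
    return [f"{topic} {suffix}" for suffix in suffixes]


def or_query(terms):
    """Join related terms with OR, quoting any multi-word phrase."""
    return " OR ".join(f'"{t}"' if " " in t else t for t in terms)


print(expand("cleantech", ["directory", "products", "vendors", "lists"]))
print(or_query(["clean tech", "green tech", "renewable energy"]))
```

Each source found this way can feed new suffixes and related terms back into the same two functions, which is what makes the process recursive.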

The next step is to take each one of these sources and validate them. That is a bit more difficult. You may want to ask yourself the following questions:

How current are they?
How frequently do they update information?
Are they aggregators?
Do they support ads? (is there a correlation between their articles and company mentions with their ads)
Are they industry associations or industry publications?
Can you detect any biases?

In the end, you come up with a list of valuable sources. This provides a starting point. You can continuously monitor these sources using tools like InfoMinder and TopicMinder. In addition you can go a level deeper and find what their sources are and start tracking those sources as well.

You may want your own relevance ranking system. The search engine’s ranking may not really work for you. For example, if you are tracking an industry for early signals, the highest page rank of a site may be completely irrelevant to your needs. My ranking criteria for a certain topic would be:

Some kind of source rank (which Google does well)
Currency (How current is the information?)
Authority (Do the authors/columnists have a large following? Are they retweeted, blogged, linked to? Do they have a high social ranking like Klout scores/LinkedIn connections?)
Is this their area of research? A topic cloud created from their recent columns and posts can give you some indication.
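One simple way to combine criteria like these is a weighted score. The sketch below is only an illustration: the field names, the 0-to-1 scales, the weights and the two sample sources are all assumptions of mine, and the right weights will differ by topic.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    source_rank: float  # search-engine style rank, scaled to 0..1
    currency: float     # how current the information is, 0..1
    authority: float    # following, links, social ranking, 0..1
    topical_fit: float  # how closely recent posts match the topic, 0..1


# Hypothetical weights for someone hunting for early signals:
# currency and topical fit matter more than raw source rank.
WEIGHTS = {
    "source_rank": 0.1,
    "currency": 0.4,
    "authority": 0.2,
    "topical_fit": 0.3,
}


def relevance(source, weights=WEIGHTS):
    """Weighted sum of the four criteria."""
    return sum(weight * getattr(source, key) for key, weight in weights.items())


def rank(sources):
    """Sources ordered from most to least relevant."""
    return sorted(sources, key=relevance, reverse=True)


portal = Source("big portal", source_rank=0.9, currency=0.2, authority=0.6, topical_fit=0.3)
blog = Source("niche blog", source_rank=0.3, currency=0.9, authority=0.5, topical_fit=0.9)

for s in rank([portal, blog]):
    print(s.name, round(relevance(s), 2))
```

With these weights the niche blog outranks the big portal, even though the portal has the higher source rank.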

Discovery is a recursive and continuous process. If the information is that important (something you may need to act upon), this additional investment in validation and customization may be worth the effort.

Meta:

Updated on Nov 12, 2012

Information Tracking, Intelligence

InfoTools Survey Results

Posted on January 4, 2019 (updated March 19, 2019) by Dorai Thodla

Yesterday, I gave a talk on InfoTools: Beyond Search at TiE Chennai. The slides of the presentation are here. I think it went well, but if I had cut down the slides and talk and given more demos, it would have gone even better. Perhaps next time.

Before starting the talk, I requested people to give me (written) answers to three questions:

What are your information needs?
What are your problems with information?
What tools do you use to manage information?

The questions were perhaps a bit vague. I realized that after going through the answers. They varied in their level of granularity (specific vs generic problems) and the definition of information itself. But here they are (slightly modified to reduce redundancy).

Here is what I got from the survey:

What are your information needs?

Potential customer info
Right info at the right time (whenever I need it)
To structure, unstructured web data
Technology and Process to manage large newspaper portal
Should be current, relevant (to my context). Should lead to (help?) actual decisions
I need reference (information) of various consultants relating to start of business viz cost, web, management etc.
Need for current information
Need (to handle?) information from multiple sources and formats
Collating information from multiple sources
Information about competition
About marketability and segments
Company address information
Company finance (annual reports)
Executives within the company
Trade details, products, services etc.
Sales leads
Knowledge enhancement
Learning about old friends/acquaintances/family
To learn to grow personally & business
Needs to be local search for providers near to me (for ex: a photo copier shop near to my house)
Technical solutions (day to day) for career and personal growth
My business is providing information based services, package with recommendations. So need for information varies.
Various technologies in market
Information about market situation
About stock/companies performance
Details/support to solve issues
Products available in the market for specifics(?)
Focused News
Similar business entity info
Public info of competitor
At a business level – market feelers about demand, ease of vendor options availability
At an execution/implementation level – latent trends in tech
Updated knowledge
Price information about products etc.
Information about technology
Looking for acquiring an IT company. Need info on the industry they are in (macro) and more about that company (micro)
Collect, compile for pattern understanding, plan for target customer
Top IT temp staffing companies in India
Total temp staff in IT in India
How do I know the customer needs
Scholarly articles on business entrepreneurship
Product information, addresses from www
Collecting/harvesting data from websites and collating, cleansing and delivering to clients
Where is the resource for information?
Where info is available, how to get data stream into our database
How cost effective, credible, valuable is the data
Accessibility
About companies wanting to enter India -setup operations, joint ventures
Companies in India wanting to enter other geographies
Consultants from outside India needing partners in India
Relevant, accurate data (specific to the task at hand)
Info about prospective customers
Info about vendors
Info about current market
Info about latest technology

Here is my list of information requirements (I took the survey along with the others)

Leads
Trends
Best practices

What are your problems with information?

Locating the right data at the right time
At times info overload
Unable to get the right (specific) information
Sometimes get caught into loads of data, making it difficult to sift through
Credibility, cost and accessibility
Frequent website updates
Different formats of information
Getting data from complex templates and grouping into finite categories
Precision, very difficult to get objective information
Currency of data
Comprehensiveness of data
Need continuous monitoring
Information overload and in such case, synthesizing & assimilating that information in a reasonable time frame is difficult
Old data, not accurate
Too much info
Not easily accessible
Irrelevant info
Filter out the actual/real info from a large pool of junk data
Do not have a scope to interact with peers in similar industries
Direct actionable information takes several searches, navigation
How to localize information (assume how to get local information) and get reliable info
How to segregate info from the web
Difficult to put together
If put together, not sure whether it is the updated info
If updated (up to date?) not sure about the integrity of the data source
Availability (sources), Reliability (sources)
Aggregation of data in a presentable manner
Too much information
Unable to identify precise locations quickly
Quality of inputs not high (always)
Too large varied and different
Formats (word, pdf, excel etc. ), hard copies, books, magazines
Difficult to authenticate, collate and organize based on requirement
I like websearch engines but I strongly believe that these search engines are at a nascent stage. I just don’t need a site coming up in my search because it is in wikipedia or yahoo
Inappropriate not timely
Have to go through lots of notes/documents/pages to get a single piece of information
Validating the information
Storing and organizing information
Time
Where to see (sources?)
Not a centralized reporting
Assimilation requires a lot of pre-formatting
Effective and speed search by everyone not followed
Not sure what to look for, where to look for and how to get it
Vast, use software to target timely, quick, on realtime
Not able to source the information in the web
We develop products based on blogs and emails. This is not enough.
Too much info
Info with noise

My List

Signal vs noise
Reliability
Authenticity

What tools do you use?

Blog, forums
Google, web search
Search engines
Reliable third parties
Friends
Regular expressions
Use bookmarking tools like delicious, share with team
Knowledge repositories (wikipedia)
Books (online/printed)
Inhouse tools to capture through automation
Infosource – www, infoanalysis – spreadsheets
Search engines to identify information
Customized perl/php/vb.net programs to manage
Scrape information from the web and manage it
Search engines
Networking sites (LinkedIn etc)
Forums
Email
My brain power, word/excel
justdial and few others provide localized service over phone but it is not so accurate
Justdial
Hakia
None
Excel/Computer/Notebooks
Peer discussions
IE Favorites (browser bookmarks)
Bing
Primary Research
Internet, newspapers, meeting – software modules
spreadsheet, email
Internet, libraries
Getting logic from other tools and using our own tools or languages
Perl, regex
Paid portals
LinkedIn
Spoke
Ecademy
Xing
My memory (sigh)

What I use:

Social bookmarks (delicious, stumble upon)
Twitter Search
Facebook groups
LinkedIn Groups and Answers
Custom search
Blog/Feed Search
Twine
Semantic Search engines
InfoMinder
InfoStreams (feed aggregator/search)
InfoPortals (just started)
Tag clouds (generated)
Concept Mapping tools
OpenCalais
Zemanta
Wikis

This is a small sample (about 40+ people who attended my talk). But you can see some patterns. I think we have a long way to go beyond search.

Information Tracking, Intelligence

Implementing an Innovation Process

Posted on January 4, 2019 (updated March 31, 2019) by Dorai Thodla

I came across this nice blog on Innovation Process Framework, by Jeffrey Philips (via Innovation Weblog)

The blog is a nice read and tries to outline a framework for Repeatable Innovation. Towards the end Jeffrey appeals to the readers to provide feedback.

If you care to, please comment or provide your feedback. I think if we practitioners, consultants and interested bystanders can create a consistent vision for the future of innovation and the tools and processes necessary for success, we can help our clients and business partners become more successful.

I have been experimenting with a few tools and some ad-hoc processes for innovation (in small product groups). So let me start out with a few tools and see how we can start putting together elements of this framework.

You can start with any simple content management system (Drupal, Plone, DotNetNuke or even a MediaWiki engine). It is also possible to use commercial portal products like SharePoint, BEA or IBM portal servers. Let us see how we can go about building a prototype of the tools required to bootstrap your Innovation Process based on the framework described by Jeffrey.

1. Trend Spotting

You can use several products that exist in the marketplace to track trends. The tools I list here provide you information to detect trends. Here is a list.

Google Alerts – A service to receive alerts based on certain keywords
InfoMinder – Our product to track specific web pages for changes (you can optionally specify filters) and receive notification. Unlike Google Alerts, InfoMinder is specific to the pages you want to track.
Digg, delicious, Techmeme, reddit or any of your favorite social bookmarking services (you can look for specific trends or retrieve information using tags)
Technorati or Google Blog Search tools
Tag Clouds (many of the services mentioned above provide tag clouds that tell you the more popular trends) or you can create your own tag clouds.
Google Trends – A product from Google that allows you to see trends based on searches
A set of high level Text Mining and Tech Mining tools (a subject that deserves almost a blog of its own)

A combination of these services and other customer services can be used to perform trend capture. You need to figure out a way to make sense of trends from these different pieces of information (Trend Spotting). Fortunately many of these tools provide RSS streams or APIs. You can easily integrate them with several content management systems.
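As a rough illustration of how little glue an RSS stream needs, here is a sketch that pulls titles and links out of an RSS 2.0 document with Python's standard library. The sample feed is invented and embedded as a string so the example runs without a network connection; fetching real feeds and pushing the items into a content management system are left out.

```python
import xml.etree.ElementTree as ET

# Invented RSS 2.0 sample standing in for a real trend feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example trend feed</title>
    <item>
      <title>Solar storage prices fall</title>
      <link>http://example.com/solar</link>
    </item>
    <item>
      <title>New wiki engine released</title>
      <link>http://example.com/wiki</link>
    </item>
  </channel>
</rss>"""


def feed_items(rss_text):
    """Return (title, link) pairs for every <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


def items_matching(items, keyword):
    """Keep only items whose title mentions the keyword (case-insensitive)."""
    return [(t, l) for t, l in items if keyword.lower() in t.lower()]


print(items_matching(feed_items(SAMPLE_FEED), "solar"))
```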

2. Generate Ideas

You can set up a workflow where people with the role of Generators look at the captured trend information, combine it with other sources and generate ideas. These can be stored in any relational database, such as MySQL or PostgreSQL.

3. Capture additional Information
In the system, Ideas are just a specific type of document with certain metadata like creator, date of creation, source of idea, description etc. It will be nice to add the capability for anyone to tag ideas. Based on tags and other criteria, ideas can be routed to Evaluators.
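An idea as "a document with metadata" that gets routed by tags might look like the sketch below. The routing table and the evaluator group names are hypothetical; in a real content management system this would be a custom content type plus a workflow rule.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Idea:
    creator: str
    created: date
    source: str
    description: str
    tags: set[str] = field(default_factory=set)


# Hypothetical routing table: tag -> evaluator group.
ROUTES = {"cleantech": "energy-team", "patent": "legal", "pricing": "marketing"}


def evaluators_for(idea, routes=ROUTES):
    """Evaluator groups that should see this idea, based on its tags."""
    return sorted({routes[tag] for tag in idea.tags if tag in routes})


idea = Idea(
    creator="dorai",
    created=date(2019, 1, 4),
    source="trend feed",
    description="Track grant announcements for cleantech startups",
)
idea.tags.update({"cleantech", "patent", "untracked-tag"})

print(evaluators_for(idea))
```

Tags that nobody has claimed yet (like the last one here) are simply ignored by the router.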

4. Evaluate Ideas
The evaluators can add comments, additional tags, classify the ideas to be further researched and send them back into the system. With each iteration, the circle widens. Ideas are further validated, combined with others or split into multiple ideas and put back into the system. Since Ideas trigger ideas, this process of combining and splitting will work well.

5. Develop and Launch

Stakeholders are found, prototypes built, ideas developed and launched as products/services. Your content management system can be used as a record keeper in this phase. In every step of the process from ideation to launch, it may be worth engaging small communities of users. Connecting to social tools like Twitter, Facebook, LinkedIn may be a good way to build and grow these communities.

6. Workflow/Process Automation

This is functionality built into several content management systems. Ideas can move from one stage to another (nascent, researched, validated etc.)

7. Idea Archetypes

One of the important aspects of the design of Idea Archetype is the progressive addition of information. Some ideas are listed here:

State – specifies the current stage of the idea. As it goes through the system, the state of the idea keeps changing
Strength – an indicator of the strength of the idea. As ideas float through the system and gather support, the strength can be progressively increased. Support increases this value and opposition decreases this value.
Next Steps – For each idea there can be a sequence of steps which can be started by the creator of the idea and collaboratively edited by others. For example, the legal department may add a patent search as a next step
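The three attributes above translate almost directly into a small data structure. A minimal sketch, using the stage names mentioned under Workflow/Process Automation (nascent, researched, validated) and treating strength as a simple counter, which is my own simplification:

```python
from dataclasses import dataclass, field

# Stage names taken from the workflow section; a real system may have more.
STAGES = ["nascent", "researched", "validated"]


@dataclass
class IdeaArchetype:
    title: str
    state: str = "nascent"
    strength: int = 0
    next_steps: list[str] = field(default_factory=list)

    def advance(self):
        """Move to the next stage, staying put once the last stage is reached."""
        position = STAGES.index(self.state)
        if position < len(STAGES) - 1:
            self.state = STAGES[position + 1]

    def support(self):
        """Support increases the strength of the idea."""
        self.strength += 1

    def oppose(self):
        """Opposition decreases the strength of the idea."""
        self.strength -= 1

    def add_next_step(self, step):
        """Anyone collaborating on the idea can append a next step."""
        self.next_steps.append(step)


idea = IdeaArchetype("Weighted tag clouds")
idea.support()
idea.support()
idea.oppose()
idea.add_next_step("Patent search (legal department)")
idea.advance()
print(idea.state, idea.strength, idea.next_steps)
```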

8. Process Maps

Argument maps, Concept maps and other mapping tools can be loosely integrated (most of them export data in XML, JSON or CSV format).

9. IdeaLogs

Ideas can also be published in blogs (private if they are meant for small internal groups). Many portal products or content management systems come with their own blog software. You can also integrate some of the popular blogging software like WordPress.

10. Wikis as Collaborative Knowledge Bases

Wikis can be used as knowledge bases to share, collaboratively edit and archive ideas. Wikis are an alternative to the idea archetypes mentioned earlier. Many wikis now provide templates for creating structured pages.

Any portal framework that supports content management, custom content types, workflow, collaboration, authentication can be used to jump start the Innovation Process in an organization. It is easy to bootstrap an innovation process using this framework and existing tools in a few weeks.

The best approach is to start with something as simple as a portal, set up some simple workflows, use a single page with extensible metadata as a basis for collaboration.

Update

Pretty much everything I described here can be done using many other portal frameworks as well. One of my recent favorites is Drupal, especially since it has started providing support for RDF (a core language for the semantic web). You can also custom build this framework using web frameworks like Rails (Ruby) or Django (Python).

Information Tracking, Intelligence

Tech Mining

Posted on January 4, 2019 (updated March 19, 2019) by Dorai Thodla

I have been reading a book called Tech Mining. I was planning to write a few blogs after finishing the book. But the whole purpose of my learn log is to (b)log as I learn. So here is some information from the first couple of chapters.

According to the authors, various types of Technology Analyses can be aided by tech mining.

1. Technology Monitoring (also known as technology watch or environmental scanning) – cataloguing, characterizing, and interpreting technology development activities.
2. Competitive Technology Intelligence (CTI) – finding out “Who is doing what?”
3. Technology Forecasting – anticipating possible future development paths for particular technologies
4. Technology Roadmapping – tracking evolutionary steps in related technologies and, sometimes, product families.
5. Technology Assessment – anticipating the possible, unintended, indirect, and delayed consequences of particular technology changes.
6. Technology Foresight – strategic planning (especially national) with emphasis on technology roles and priorities
7. Technology Process Management – getting people making decisions about technology
8. Science and Technology Indicators – time series that track advances in national (or other) technological capabilities.

We do a bit of the first activity with our product InfoMinder, but have a long way to go in providing the other capabilities mentioned above. We do plan to help customers set up Information Portals to store the tracked information and do some automatic linking.

Information Tracking, Intelligence

Applied XML: A new Webservice API for Books

Posted on January 4, 2019 (updated March 19, 2019) by Dorai Thodla

It is interesting to track new APIs and mashups at Programmable Web. I think it is one of the most useful resources on the web. Using these APIs you can build your own little applications called mashups.

Today I came across one that shows an interesting trend developing. It is a programming interface for accessing books, subjects, publishers and authors. The site lists some impressive statistics:

Statistics (03/01/2006):
•	Books:	2,052,741
•	Subjects:	1,023,326
•	Authors:	750,740
•	Publishers:	149,800
•	Sources:	3,360,800

Associated with the API are a few XML vocabularies.

An XML vocabulary is useful for exchanging data in a common format. When data is stored in XML format, it is more amenable to access and manipulation. As these formats spread in popularity, there is a likelihood that more people will start using them. These are the baby steps in the making of a data web.
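To show why a shared vocabulary makes data "amenable to access and manipulation", here is a small sketch that reads book records from XML using Python's standard library. The element and attribute names (books, book, isbn, title, author, publisher) and the sample record are invented for illustration; they are not the vocabulary of the API mentioned above.

```python
import xml.etree.ElementTree as ET

# Invented sample; a real API would define its own element names.
SAMPLE_BOOKS = """<books>
  <book isbn="0000000000">
    <title>Example Title</title>
    <author>Example Author</author>
    <publisher>Example Publisher</publisher>
  </book>
</books>"""


def parse_books(xml_text):
    """Turn each <book> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return [
        {
            "isbn": book.get("isbn"),
            "title": book.findtext("title"),
            "author": book.findtext("author"),
            "publisher": book.findtext("publisher"),
        }
        for book in root.iter("book")
    ]


for record in parse_books(SAMPLE_BOOKS):
    print(record["title"], "by", record["author"])
```

Once two services agree on names like these, a mashup can combine their data without any screen scraping.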