Paper Title
WEB DATA EXTRACTION FROM MULTIPLE DATA SOURCES USING QUERY FORMULATION LANGUAGE
Abstract
A query formulation language is presented that makes it easy to query and blend structured data on the web. The
main novelty of this language is that it allows people with limited IT skills to explore and query one (or multiple) data
sources without prior knowledge of the schema, structure, terminology, or any technical details of these sources.
The data sources considered may have either an offline or an inline schema. This raises several language-design and performance
complications that I need to address. I choose to query RDF, as it is the most primitive data model. N-gram models
of natural language are used to predict the class of each word in the input query. The words of the query may be
classified into noun phrases, verbs, and adjectives to understand the context of the query. Using this classification, we can group the
query syntactically as well as semantically. I demonstrate how MashQL can be used to query and mash up the
Data Web. Finally, I evaluate MashQL by querying two data sets, DBLP and DBpedia, and show that the indexing
techniques allow instant user interaction.
Keywords- Query formulation, semantic data web, RDF, NLP
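
The word-class prediction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a bigram (2-gram) model over part-of-speech tags with a greedy decoder; the training queries, the tag names (VERB, ADJ, NOUN), and the functions train and tag_query are hypothetical and chosen only for illustration. They are not taken from the system described here, which may use a different N-gram order or decoding strategy.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged training queries (illustrative only).
TRAINING = [
    [("find", "VERB"), ("recent", "ADJ"), ("articles", "NOUN")],
    [("list", "VERB"), ("authors", "NOUN")],
    [("show", "VERB"), ("popular", "ADJ"), ("books", "NOUN")],
    [("find", "VERB"), ("old", "ADJ"), ("books", "NOUN")],
]

START = "<s>"  # pseudo-tag marking the beginning of a query


def train(tagged_queries):
    """Count tag bigrams (previous tag -> tag) and word/tag co-occurrences."""
    transitions = defaultdict(Counter)
    emissions = defaultdict(Counter)
    for query in tagged_queries:
        prev = START
        for word, tag in query:
            transitions[prev][tag] += 1
            emissions[word.lower()][tag] += 1
            prev = tag
    return transitions, emissions


def tag_query(words, transitions, emissions):
    """Greedily tag each word, using the previous tag as bigram context."""
    tagged = []
    prev = START
    for word in words:
        seen = emissions.get(word.lower())
        if seen:
            # Known word: most frequent tag, ties broken by the bigram count.
            tag = max(seen, key=lambda t: (seen[t], transitions[prev][t]))
        elif transitions[prev]:
            # Unknown word: most frequent tag following the previous tag.
            tag = transitions[prev].most_common(1)[0][0]
        else:
            tag = "NOUN"  # assumed default when no context is available
        tagged.append((word, tag))
        prev = tag
    return tagged


if __name__ == "__main__":
    transitions, emissions = train(TRAINING)
    # "new" and "papers" are unseen, so their tags come from the bigram context.
    print(tag_query(["find", "new", "papers"], transitions, emissions))
```

In this sketch, words that never appeared in the training data are classified purely from the tag of the preceding word, which is what allows a query with unfamiliar terminology to be grouped syntactically.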