BIG DATA: What is Big Data?
Big data is a collection of data sets so large and complex that it becomes
difficult to process using on-hand database management tools.
The challenges include capture, data curation, storage, search, sharing,
analysis, and visualization.
The trend toward larger data sets is due to the additional information derivable
from analysis of a single large set of related data, as compared to separate
smaller sets with the same total amount of data. Analyzing the data together
allows correlations to be found to "spot business trends, determine quality of
research, prevent diseases, link legal citations, combat crime, and determine
real-time roadway traffic conditions."
Big data is the realization of greater business intelligence by storing,
processing, and analyzing data that was previously ignored due to the
limitations of traditional data management technologies.
Characteristics of Big Data
• Volume: Large volumes of data.
• Velocity: Fast-moving data.
• Variety: Structured, unstructured, images, etc.
• Veracity: Trust and integrity are both a challenge and a necessity, and they matter for big data just as they do for traditional relational databases.
• Value: Big data is of no use unless we can turn it into value.
Three Types of Data:
· Structured data: Relational data.
· Semi-structured data: XML data.
· Unstructured data: Word, PDF, text, media logs.
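The difference between the three types can be illustrated with a small sketch. The sample records, field names, and the date pattern below are made up for illustration and do not come from any particular system.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Structured: fixed columns, as in a relational table (hypothetical sample rows).
structured = io.StringIO("id,name,city\n1,Asha,Pune\n2,Ravi,Delhi\n")
rows = list(csv.DictReader(structured))

# Semi-structured: XML carries its own tags, but fields may vary per record.
semi_structured = (
    "<users>"
    "<user id='1'><name>Asha</name></user>"
    "<user id='2'><name>Ravi</name><city>Delhi</city></user>"
    "</users>"
)
root = ET.fromstring(semi_structured)
users = [
    {"id": u.get("id"), "name": u.findtext("name"), "city": u.findtext("city")}
    for u in root.findall("user")
]

# Unstructured: free text with no schema; any structure has to be extracted.
unstructured = "2024-01-05 Asha logged in from Pune. Ravi reported an error."
dates = re.findall(r"\d{4}-\d{2}-\d{2}", unstructured)

print(rows[0]["city"])   # every structured row has a city column
print(users[0]["city"])  # the first XML record has no <city> element
print(dates)             # dates pulled out of free text by pattern matching
```

Every structured row has the same columns, the XML records can omit fields, and the free text only yields structure after some extraction step is applied to it.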
The Four Dimensions of Use in Big Data
These dimensions describe how users want to interact with their data:
• Totality: Users have an increased desire to process and analyze all available data.
• Exploration: Users apply analytic approaches where the schema is defined in response to the nature of the query.
• Frequency: Users have a desire to increase the rate of analysis in order to generate more accurate and timely business intelligence.
• Dependency: Users need to balance investment in existing technologies and skills with the adoption of new techniques.
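The Exploration dimension, where the schema is defined in response to the query, is often called schema-on-read. Below is a minimal sketch of the idea, assuming raw events are stored as JSON lines. The event fields and the `query` helper are hypothetical and only serve the illustration.

```python
import json

# Raw events are stored as-is, with no schema enforced at write time.
raw_events = [
    '{"user": "u1", "action": "click", "ms": 120}',
    '{"user": "u2", "action": "view"}',
    '{"user": "u1", "action": "click", "ms": "95"}',
]

def query(lines, schema):
    """Apply a schema at read time: select the requested fields and coerce types.

    `schema` maps field name -> (converter, default).
    Fields missing from a record get the default value.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: convert(record[field]) if field in record else default
            for field, (convert, default) in schema.items()
        }

# Each query brings its own schema; the stored data does not change.
latency_schema = {"user": (str, ""), "ms": (int, 0)}
latencies = list(query(raw_events, latency_schema))
total_ms = sum(r["ms"] for r in latencies)
print(total_ms)
```

A different question, for example counting actions per user, would supply a different schema over the same stored events, which is what distinguishes this approach from a fixed relational schema defined before the data is loaded.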
Tools and Systems:
• Hands-on Systems
  · MySQL
  · MapReduce (YARN)
  · HDFS
  · HBase
  · DynamoDB
  · Cassandra
  · Memcached
  · Redis
  · MongoDB
  · Pig
  · Hive
  · Impala
  · Mahout
  · Spark
• Design Knowledge
  · Bigtable
  · Dynamo
  · Dremel
  · Spanner
  · Storm




