Exam Details
Subject | Data Mining
Exam / Course | MCA
Organization | Gujarat Technological University
Exam Date | May, 2018
City, State | Ahmedabad, Gujarat
Question Paper
Seat No.: ________ Enrolment No.: ________
GUJARAT TECHNOLOGICAL UNIVERSITY
ME SEMESTER-I • EXAMINATION SUMMER 2018
Subject Code: 2712309 Date: 10/05/2018
Subject Name: Data Mining
Time: 02:30 PM to 05:00 PM Total Marks: 70
Instructions:
1. Attempt all questions.
2. Make suitable assumptions wherever necessary.
3. Figures to the right indicate full marks.
Q.1
List the different data warehouse schemas and explain any one of them in detail with a suitable diagram.
07
Compare frequent itemset generation using the FP-growth algorithm with the Apriori algorithm.
07
Q.2
Define the term "Data Mining". With the help of a suitable diagram, explain the process of knowledge discovery from databases.
07
What is data cleaning? Describe the different methods of handling missing values during data cleaning.
07
OR
What is data transformation? Explain the different approaches to data transformation.
07
Q.3
What is Concept Hierarchy? List and explain different types of Concept Hierarchy.
07
What is an outlier? How do outliers impact the results of mining? Explain any one method to detect outliers.
07
OR
Q.3
State the Apriori Property. Generate large itemsets and association rules using Apriori algorithm on the following data set having minimum support value as 2 and minimum confidence value as 75%.
TID | Items Purchased
T101 | Cheese, milk, cookie
T102 | Butter, milk, bread
T103 | Cheese, butter, milk, bread
T104 | Butter, bread
07
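The Apriori question above can be checked with a short Python sketch. It is only an illustration under stated assumptions: item names are lowercased so that "Cheese" and "cheese" match, minimum support is read as an absolute count of 2, and the helper names (`support`, `apriori`, `rules`) are hypothetical, not from any library.

```python
from itertools import combinations

# Transactions from the question, lowercased for consistent matching.
transactions = {
    "T101": {"cheese", "milk", "cookie"},
    "T102": {"butter", "milk", "bread"},
    "T103": {"cheese", "butter", "milk", "bread"},
    "T104": {"butter", "bread"},
}
MIN_SUPPORT = 2        # absolute count, as stated in the question
MIN_CONFIDENCE = 0.75  # 75%


def support(itemset):
    """Count the transactions that contain every item in itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)


def apriori():
    """Return {frozenset: support count} for all frequent itemsets."""
    items = sorted(set().union(*transactions.values()))
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= MIN_SUPPORT]
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        level = [
            c for c in candidates
            if all(frozenset(sub) in frequent
                   for sub in combinations(c, k - 1))
            and support(c) >= MIN_SUPPORT
        ]
    return frequent


def rules(frequent):
    """Return (antecedent, consequent, confidence) for rules that pass."""
    out = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                conf = count / frequent[lhs]
                if conf >= MIN_CONFIDENCE:
                    out.append((set(lhs), set(itemset - lhs), conf))
    return out


if __name__ == "__main__":
    freq = apriori()
    for s, n in sorted(freq.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(s), n)
    for lhs, rhs, conf in rules(freq):
        print(sorted(lhs), "->", sorted(rhs), f"{conf:.0%}")
```

The prune step is where the Apriori property does its work: a candidate such as {bread, cheese, milk} is discarded without counting because its subset {bread, cheese} is not frequent.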
Explain methods for data normalization for age values as: 13, 15, 16, 16, 19, 20, 23, 29, 35, 41, 44, 53, 62, 69, 72. Transform age value 45 for all methods.
For min-max normalization consider range as [0.0, 1.0].
For z-score normalization standard deviation of age is 20.64 years.
For decimal scaling make necessary assumption.
07
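A minimal Python sketch of the three normalization methods for the age data above. The assumptions are labeled in the comments: the min-max target range is taken to be 0.0 to 1.0, the standard deviation is the 20.64 given in the question rather than one recomputed from the data, and decimal scaling divides by the smallest power of ten that brings the largest absolute value below 1. The function names are hypothetical.

```python
ages = [13, 15, 16, 16, 19, 20, 23, 29, 35, 41, 44, 53, 62, 69, 72]
VALUE = 45
STD_DEV = 20.64  # given in the question, not recomputed here


def min_max(v, data, new_min=0.0, new_max=1.0):
    """Map v linearly from [min(data), max(data)] to [new_min, new_max]."""
    lo, hi = min(data), max(data)
    return (v - lo) / (hi - lo) * (new_max - new_min) + new_min


def z_score(v, data, std):
    """Distance of v from the mean of data, in units of std."""
    mean = sum(data) / len(data)
    return (v - mean) / std


def decimal_scaling(v, data):
    """Divide v by 10^j, with j the smallest integer making max|x| / 10^j < 1."""
    largest = max(abs(x) for x in data)
    j = 0
    while largest / 10 ** j >= 1:
        j += 1
    return v / 10 ** j


if __name__ == "__main__":
    print(f"min-max:         {min_max(VALUE, ages):.3f}")
    print(f"z-score:         {z_score(VALUE, ages, STD_DEV):.3f}")
    print(f"decimal scaling: {decimal_scaling(VALUE, ages):.2f}")
```

With a minimum of 13, a maximum of 72, and a mean of 527/15, the value 45 maps to 32/59 under min-max and to 0.45 under decimal scaling (j = 2).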
Q.4
What are supervised and unsupervised learning? What is Cluster Analysis? List and explain the requirements of clustering in data mining.
07
Explain with example how continuous numerical data values can be discretized.
07
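One common way to discretize continuous values is equal-width binning, sketched below in Python. The data values and the function name `equal_width_bins` are hypothetical, chosen only for illustration, and the sketch assumes the values are not all identical (otherwise the bin width would be zero).

```python
def equal_width_bins(values, k):
    """Assign each value a bin index 0..k-1 using k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # The maximum value would land on index k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


if __name__ == "__main__":
    data = [4, 8, 15, 21, 24, 25, 28, 34]  # hypothetical sample values
    print(equal_width_bins(data, 3))
```

Here the range 4 to 34 is split into three intervals of width 10, so 4 and 8 fall in the first bin, 15 and 21 in the second, and the remaining values in the third.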
OR
Q.4
Write the steps of the K-Means clustering algorithm with its pros and cons. How does the K-Means clustering method differ from the K-Medoids clustering method?
07
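The K-Means steps (assign each point to its nearest centroid, recompute each centroid as the mean of its cluster, repeat until nothing changes) can be sketched in Python for one-dimensional data. The points, the initial centroids, and the function name `k_means` are hypothetical; real use would normally pick initial centroids at random and work in several dimensions.

```python
def k_means(points, centroids, max_iter=100):
    """Run 1-D K-Means from the given initial centroids.

    Returns (final centroids, clusters as lists of points).
    """
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return centroids, clusters


if __name__ == "__main__":
    pts = [1, 2, 3, 10, 11, 12]       # hypothetical data
    print(k_means(pts, [1, 2]))       # hypothetical initial centroids
```

K-Medoids follows the same loop but restricts each cluster centre to an actual data point, which makes it less sensitive to outliers than the mean used here.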
Explain the following attribute selection measures for decision trees:
(i) Information Gain (ii) Gain Ratio
07
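Both measures can be illustrated with a small Python sketch. The toy attribute and class values are hypothetical, as are the function names; information gain is computed as the drop in class entropy after splitting on an attribute, and gain ratio divides that gain by the split information (the entropy of the attribute's own value distribution).

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(values, labels):
    """Class entropy minus the weighted entropy after splitting on values."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalized by the split information of the attribute."""
    split_info = entropy(values)
    return info_gain(values, labels) / split_info if split_info else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: two attributes and one class label.
    outlook = ["sunny", "sunny", "rain", "rain"]
    windy = ["yes", "no", "yes", "no"]
    play = ["no", "no", "yes", "yes"]
    print("outlook:", info_gain(outlook, play), gain_ratio(outlook, play))
    print("windy:  ", info_gain(windy, play), gain_ratio(windy, play))
```

In this toy data `outlook` separates the classes perfectly while `windy` tells nothing about them, so a decision tree using either measure would split on `outlook` first.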
Q.5
List and explain various OLAP operations.
07
What is a time-series database? How can time-series data be characterized using trend analysis?
07
OR
Q.5
Explain the different types of web mining with suitable examples.
07
Discuss the characteristics and limitations of Neural Networks. Mention any two suitable applications of neural networks.
07