Understanding Data

This page contains questions and answers for the Exercise of NCERT Class 12 Computer Science, Chapter 7: Understanding Data.
Exercise
Question 1
1. Identify data required to be maintained to perform the following services:
a)
Declare exam results and print e-certificates.
b)
Register participants in an exhibition and issue biometric ID cards.
c)
To search for an image by a search engine.
d)
To book an OPD appointment with a hospital in a specific department.
Answer 1
a) Declare exam results and print e-certificates
StudentName, RollNo, Class/Section, SchoolName.
Subject-wise Marks, TotalMarks, Percentage, Grade/ResultStatus (Pass/Fail).
ResultDate, Board/ExamName, Session/Year.
CertificateID/SerialNo, IssueDate, QR code / verification link.
Digital signature / authority name and designation.
b) Register participants in an exhibition and issue biometric ID cards
ParticipantName, Father/MotherName, DateOfBirth, Gender.
Address, MobileNo, EmailID.
Photo, ID proof number (Aadhaar/Passport etc.).
Biometric data (Fingerprint/Iris template), BiometricID.
RegistrationID, EntryPassType, Validity dates.
c) To search for an image by a search engine
Image URL / file path, Image title/caption, Alt text / tags / keywords.
Metadata: image type (JPEG/PNG), size (KB/MB), resolution.
Upload date, source website/page URL.
(Optional) extracted features for matching/similarity.
d) To book an OPD appointment with a hospital in a specific department
PatientName, Age, Gender, MobileNo, Address, PatientID.
DepartmentName, DoctorName/DoctorID, Available slots (date/time).
AppointmentDateTime, TokenNo, PaymentStatus (if any).
Symptoms/Reason for visit (brief), Insurance details (optional).
Question 2
2. A school having 500 students wants to identify beneficiaries of the merit-cum-means scholarship, achieving more than 75% for two consecutive years and having family income less than 5 lakh per annum. Briefly describe data processing steps to be taken by the school to prepare the list of beneficiaries.
Answer 2
Goal (conditions)
More than 75% for two consecutive years.
Family income < 5 lakh per annum.
Data Processing Cycle / Steps
Data Collection: Collect marks/percentage for last 2 years and family income details for all 500 students.
Data Preparation: Clean data (missing income, wrong percentages), keep a consistent format, remove duplicates.
Data Entry: Enter into a spreadsheet/CSV/database with fields like StudentName, RollNo, %Year1, %Year2, FamilyIncome.
Store & Retrieve: Store dataset and retrieve it whenever filtering is required.
Processing / Classify: Filter students where %Year1 > 75 and %Year2 > 75, then select those with FamilyIncome < 500000.
Output (Reports/Results): Generate the final beneficiary list and a summary (e.g., total beneficiaries).
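The Processing and Output steps above can be sketched in Python as follows. This is only a minimal illustration: the four student records, the field names (roll_no, pct_year1, pct_year2, family_income) and the helper is_beneficiary are hypothetical, and a real school would load all 500 records from its spreadsheet or database.

```python
# Hypothetical sample records (a real dataset would hold all 500 students)
students = [
    {"roll_no": 1, "name": "Student A", "pct_year1": 82.0, "pct_year2": 78.5, "family_income": 350000},
    {"roll_no": 2, "name": "Student B", "pct_year1": 91.0, "pct_year2": 74.0, "family_income": 200000},
    {"roll_no": 3, "name": "Student C", "pct_year1": 76.0, "pct_year2": 88.0, "family_income": 650000},
    {"roll_no": 4, "name": "Student D", "pct_year1": 79.5, "pct_year2": 80.0, "family_income": 499999},
]


def is_beneficiary(student):
    """Both scholarship conditions: >75% in two consecutive years and income < 5 lakh."""
    return (
        student["pct_year1"] > 75
        and student["pct_year2"] > 75
        and student["family_income"] < 500000
    )


# Processing: keep only the students who satisfy every condition
beneficiaries = [s for s in students if is_beneficiary(s)]

# Output: the final list and a summary count
for s in beneficiaries:
    print(s["roll_no"], s["name"])
print("Total beneficiaries:", len(beneficiaries))
```

With this sample data, Student B is dropped for scoring 74% in the second year and Student C for an income above 5 lakh, so only roll numbers 1 and 4 remain.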
Question 3
3. A bank “xyz” wants to know about its popularity among the residents of a city “ABC” on the basis of number of bank accounts each family has and the average monthly account balance of each person. Briefly describe the steps to be taken for collecting data and what results can be checked through processing of the collected data.
Answer 3
A) Steps for collecting data
Decide data fields: FamilyID, FamilySize, NumberOfAccountsInFamily, and for each person: PersonID, Age, MonthlyAvgBalance.
Data sources: Bank records (accounts and balances) and a survey/registration drive to link families correctly if needed.
Data entry & storage: Store as structured data in tables (rows/columns).
B) Results you can check after processing
Popularity by coverage: Percentage of families having at least 1 account in bank xyz.
Average accounts per family: Mean of NumberOfAccountsInFamily.
Dominant category: Most common number of accounts per family (mode).
Average monthly balance: Overall mean balance per person; compare mean balance across areas/age groups.
Variation in balances: Use standard deviation to measure spread.
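The results listed above could be computed with Python's standard statistics module, as in the sketch below. The four family rows and their balances are made-up survey values used purely for illustration, and the tuple layout (family ID, number of accounts in bank xyz, list of balances) is an assumption, not a prescribed format.

```python
import statistics

# Hypothetical survey rows:
# (family_id, accounts_in_bank_xyz, [average monthly balance of each account holder])
families = [
    ("F1", 2, [12000, 8000]),
    ("F2", 0, []),
    ("F3", 1, [25000]),
    ("F4", 2, [5000, 15000]),
]

# Popularity by coverage: share of families with at least one account
with_account = [f for f in families if f[1] >= 1]
coverage_pct = 100 * len(with_account) / len(families)

# Average and most common number of accounts per family
accounts = [f[1] for f in families]
mean_accounts = statistics.mean(accounts)
mode_accounts = statistics.mode(accounts)

# Average balance per person and its spread (population standard deviation)
balances = [b for f in families for b in f[2]]
mean_balance = statistics.mean(balances)
sd_balance = statistics.pstdev(balances)

print("Coverage (%):", coverage_pct)
print("Mean accounts per family:", mean_accounts)
print("Mode of accounts per family:", mode_accounts)
print("Mean monthly balance:", mean_balance)
print("Std. deviation of balance:", round(sd_balance, 2))
```

Here pstdev divides by n, which suits a survey that covers every family of interest; statistics.stdev (dividing by n - 1) would be the usual choice if the rows were only a sample of the city.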
Question 4
4. Identify type of data being collected/generated in the following scenarios:
a)
Recording a video.
b)
Marking attendance by teacher.
c)
Writing tweets.
d)
Filling an application form online.
Answer 4
a) Recording a video: Unstructured data (multimedia: frames + audio; not fixed rows/columns).
b) Marking attendance by teacher: Structured data (table like: RollNo, Name, Date, Present/Absent).
c) Writing tweets: Unstructured data (text + emojis + images/videos; no fixed structure).
d) Filling an application form online: Mostly structured data (fixed fields like name, DOB, address, etc.) stored in a table/database.
Question 5
5. Consider the temperature (in Celsius) of 7 days of a week as 34, 34, 27, 28, 27, 34, 34. Identify the appropriate statistical technique to be used to calculate the following:
a)
Find the average temperature.
b)
Find the temperature range of that week.
c)
Find the standard deviation temperature.
Answer 5
a) Find the average temperature: Use Mean (Average).
b) Find the temperature range of that week: Use Range = Maximum − Minimum.
c) Find the standard deviation temperature: Use Standard Deviation (spread/variation around mean).
Python Program + Sample Output
# Temperature data for 7 days (in Celsius)
temps = [34, 34, 27, 28, 27, 34, 34]

# 1) Mean (Average)
mean_temp = sum(temps) / len(temps)

# 2) Range = max - min
temp_range = max(temps) - min(temps)

# 3) Population Standard Deviation (divide by n)
#    Steps:
#    - find squared difference of each value from mean
#    - take average of those squared differences
#    - take square root
squared_diffs = []
for x in temps:
    squared_diffs.append((x - mean_temp) ** 2)
variance = sum(squared_diffs) / len(temps)
std_dev = variance ** 0.5

print("Temperatures:", temps)
print("Mean (Average):", round(mean_temp, 2))
print("Range:", temp_range)
print("Standard Deviation:", round(std_dev, 2))
Sample Output
Temperatures: [34, 34, 27, 28, 27, 34, 34]
Mean (Average): 31.14
Range: 7
Standard Deviation: 3.31
Question 6
6. A school teacher wants to analyse results. Identify the appropriate statistical technique to be used along with its justification for the following cases:
a)
Teacher wants to compare performance in terms of division secured by students in Class XII A and Class XII B where each class strength is same.
b)
Teacher has conducted five unit tests for that class in months July to November and wants to compare the class performance in these five months.
Answer 6
a) Compare performance in terms of division secured by Class XII A and XII B (same strength)
Use Mode (and frequency count) because “division” (First/Second/Third) is categorical.
Compare the most frequent division in each class to judge performance.
b) Five unit tests (July to November) compare class performance in these five months
Use Mean (average marks) for each unit test month to compare overall performance.
If extreme outliers exist, also check Median.
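Both techniques can be tried out with a short Python sketch. The division lists and unit-test marks below are invented sample data (six students per class, four marks per month) chosen only to keep the example small.

```python
from collections import Counter
import statistics

# (a) Hypothetical divisions secured in two classes of equal strength
class_a = ["First", "First", "Second", "First", "Third", "Second"]
class_b = ["Second", "First", "Second", "Second", "Third", "First"]

mode_by_class = {}
for label, divisions in (("XII A", class_a), ("XII B", class_b)):
    counts = Counter(divisions)                         # frequency of each division
    mode_by_class[label] = statistics.mode(divisions)   # most frequent division
    print(label, dict(counts), "Mode:", mode_by_class[label])

# (b) Hypothetical marks of the same class in five monthly unit tests
unit_tests = {
    "July":      [12, 15, 18, 20],
    "August":    [14, 16, 17, 19],
    "September": [10, 18, 19, 20],
    "October":   [15, 17, 18, 20],
    "November":  [16, 18, 19, 20],
}

monthly_mean = {month: statistics.mean(marks) for month, marks in unit_tests.items()}
monthly_median = {month: statistics.median(marks) for month, marks in unit_tests.items()}
best_month = max(monthly_mean, key=monthly_mean.get)

for month in unit_tests:
    print(month, "Mean:", monthly_mean[month], "Median:", monthly_median[month])
print("Best month by mean:", best_month)
```

In this sample, September's mean is pulled down by one low score (10) while its median stays high, which is the situation where checking the median alongside the mean is useful.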
Question 7
7. Suppose annual day of your school is to be celebrated. The school has decided to felicitate those parents of the students studying in classes XI and XII, who are the alumni of the same school. In this context, answer the following questions:
a)
Which statistical technique should be used to find out the number of students whose both parents are alumni of this school?
b)
How varied are the age of parents of the students of that school?
Answer 7
a) Technique to find number of students whose both parents are alumni
Create a field like BothParentsAlumni = Yes/No.
Use frequency/count of “Yes” (same idea as mode, which is frequency-based).
b) How varied are the ages of parents?
Use Standard Deviation to measure how spread out the ages are.
You can also mention Range for quick spread: max age – min age.
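A minimal Python sketch of both parts is given below. The four records, their alumni flags and the parents' ages are hypothetical values; a real dataset would have one row for every student of classes XI and XII.

```python
import statistics

# Hypothetical records:
# (student, father_is_alumnus, mother_is_alumna, father_age, mother_age)
records = [
    ("S1", True,  True,  45, 42),
    ("S2", True,  False, 50, 47),
    ("S3", False, False, 41, 39),
    ("S4", True,  True,  48, 44),
]

# (a) Frequency count of students whose both parents are alumni
both_alumni_count = sum(
    1 for _, father, mother, _, _ in records if father and mother
)

# (b) Spread of parents' ages: population standard deviation and range
ages = [age for record in records for age in record[3:]]
age_sd = statistics.pstdev(ages)
age_range = max(ages) - min(ages)

print("Students with both parents alumni:", both_alumni_count)
print("Std. deviation of ages:", round(age_sd, 2))
print("Range of ages:", age_range)
```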
Question 8
8. For the annual day celebrations, the teacher is looking for an anchor in a class of 42 students. The teacher would make selection of an anchor on the basis of singing skill, writing skill, as well as monitoring skill.
a)
Which mode of data collection should be used?
b)
How would you represent the skill of students as data?
Answer 8
a) Which mode of data collection should be used?
Use observation + practical try-out (audition).
Ask students to perform anchoring (short hosting task).
Observe singing/writing/monitoring skills and record scores.
b) How would you represent the skill of students as data?
Make a structured data table (rows = students, columns = skills).
Example fields: StudentName, RollNo, SingingSkillScore (0-10), WritingSkillScore (0-10), MonitoringSkillScore (0-10), Remarks (optional).
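One possible way to hold such a table in Python is a list of dictionaries, one per student, as sketched below. The three students and their scores are hypothetical, and adding the three scores with equal weight to rank candidates is an assumption made for illustration; the teacher may weigh the skills differently.

```python
# Hypothetical scores (0-10) recorded during the audition
students = [
    {"roll_no": 1, "name": "Student A", "singing": 8, "writing": 6, "monitoring": 7},
    {"roll_no": 2, "name": "Student B", "singing": 5, "writing": 9, "monitoring": 8},
    {"roll_no": 3, "name": "Student C", "singing": 9, "writing": 8, "monitoring": 9},
]

SKILLS = ("singing", "writing", "monitoring")


def total_score(student):
    """Sum of the three skill scores (equal weighting is an assumption)."""
    return sum(student[skill] for skill in SKILLS)


# Rank students from highest to lowest total score
ranked = sorted(students, key=total_score, reverse=True)

for s in ranked:
    print(s["roll_no"], s["name"], "Total:", total_score(s))
```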
Question 9
9. Differentiate between structured and unstructured data giving one example.
The principal of a school wants to do following analysis on the basis of food items procured and sold in the canteen:
a)
Compare the purchase and sale price of fruit juice and biscuits.
b)
Compare sales of fruit juice, biscuits and samosa.
c)
Variation in sale price of fruit juices of different companies for same quantity (in ml).
Create an appropriate dataset for these items (fruit juice, biscuits, samosa) by listing their purchase price and sale price. Apply basic statistical techniques to make the comparisons.
Answer 9
Structured vs Unstructured Data
Basis | Structured Data | Unstructured Data
Meaning | Organized in a well-defined format, usually rows and columns, with attributes/variables. | Does not follow traditional row-column structure; no fixed pattern (like newspaper/email/web pages).
Example | Attendance sheet / student marks table. | A video file / tweet / web page with text + images.
A) Sample dataset (structured)
Store purchase price, sale price, and units sold for each item to compare prices and sales.
Category | Item | Company | Qty (ml) | Purchase Price | Sale Price | Units Sold
Fruit Juice | Juice | Real | 200 | 18 | 25 | 40
Fruit Juice | Juice | Tropicana | 200 | 20 | 28 | 35
Fruit Juice | Juice | MinuteMaid | 200 | 19 | 26 | 30
Biscuits | Parle-G | - | - | 5 | 8 | 120
Biscuits | Oreo | - | - | 18 | 25 | 60
Snack | Samosa | - | - | 7 | 12 | 150
Python Program + Sample Output
# Canteen dataset (sample)
# For fruit juice, we include multiple companies (same quantity)
# to study variation in sale price.

items = [
    # category, item, company, qty_ml, purchase_price, sale_price, units_sold
    ("Fruit Juice", "Juice", "Real",       200, 18, 25, 40),
    ("Fruit Juice", "Juice", "Tropicana",  200, 20, 28, 35),
    ("Fruit Juice", "Juice", "MinuteMaid", 200, 19, 26, 30),

    ("Biscuits", "Parle-G", None, None, 5, 8, 120),
    ("Biscuits", "Oreo",    None, None, 18, 25, 60),

    ("Snack", "Samosa", None, None, 7, 12, 150),
]

# (a) Compare purchase and sale price of fruit juice and biscuits
print("A) Purchase vs Sale (Profit per unit)")
print()
for cat, name, comp, qty, buy, sell, sold in items:
    if cat in ("Fruit Juice", "Biscuits"):
        label = (
            cat + " - " + name
            if comp is None
            else cat + " - " + comp + " (" + str(qty) + "ml)"
        )
        profit = sell - buy
        line = (
            label
            + " Buy=" + str(buy)
            + "  Sell=" + str(sell)
            + "  Profit=" + str(profit)
        )
        print(line)

# (b) Compare sales of fruit juice, biscuits and samosa
# (total units sold per category)
sales_by_cat = {}
for cat, name, comp, qty, buy, sell, sold in items:
    sales_by_cat[cat] = sales_by_cat.get(cat, 0) + sold

print()
print("B) Total units sold by category")
print()
for cat in sorted(sales_by_cat):
    print(
        cat + ": " + str(sales_by_cat[cat]) + " units"
    )

# (c) Variation in sale price of fruit juices
# (different companies, same quantity)
fruit_sales_prices = [
    sell
    for cat, name, comp, qty, buy, sell, sold in items
    if cat == "Fruit Juice"
]

mean_price = sum(fruit_sales_prices) / len(fruit_sales_prices)
price_range = max(fruit_sales_prices) - min(fruit_sales_prices)
variance = (
    sum((x - mean_price) ** 2 for x in fruit_sales_prices)
    / len(fruit_sales_prices)
)
std_dev = variance ** 0.5

print()
print("C) Variation in sale price (Fruit Juice, 200ml)")
print()
print("Sale prices:", fruit_sales_prices)
print("Mean sale price:", round(mean_price, 2))
print("Range:", price_range)
print("Standard Deviation:", round(std_dev, 2))
Sample Output
A) Purchase vs Sale (Profit per unit)

Fruit Juice - Real (200ml) Buy=18  Sell=25  Profit=7
Fruit Juice - Tropicana (200ml) Buy=20  Sell=28  Profit=8
Fruit Juice - MinuteMaid (200ml) Buy=19  Sell=26  Profit=7
Biscuits - Parle-G Buy=5  Sell=8  Profit=3
Biscuits - Oreo Buy=18  Sell=25  Profit=7

B) Total units sold by category

Biscuits: 180 units
Fruit Juice: 105 units
Snack: 150 units

C) Variation in sale price (Fruit Juice, 200ml)

Sale prices: [25, 28, 26]
Mean sale price: 26.33
Range: 3
Standard Deviation: 1.25
How this matches basic statistical techniques
Mean gives average sale price.
Range gives spread using max – min.
Standard deviation shows variation using all values.