Program 4

4. Develop a program to print the 10 most frequently appearing words in a text file. [Hint: Use a dictionary with distinct words and their frequency of occurrences. Sort the dictionary in the reverse order of frequency and display the dictionary slice of the first 10 items.]

Objective of the Program

  • To understand file handling in Python

  • To use a dictionary for frequency counting

  • To perform sorting based on values

  • To slice sorted data


Logical Construction (Think Before You Code)

Let’s understand the problem carefully.

Example Text File Content

python is easy
python is powerful
python is popular

Distinct words: python, is, easy, powerful, popular

Frequencies:
python → 3
is → 3
easy → 1
powerful → 1
popular → 1

We must:

  1. Open the file

  2. Read the content

  3. Convert text to lowercase (to avoid duplicate words like Python/python)

  4. Split into words

  5. Remove punctuation

  6. Count frequency using a dictionary

  7. Sort the dictionary by frequency in descending order

  8. Print the first 10 words

👉 Key Idea:
Use dictionary → sort by values → slice first 10.
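
The key idea can be tried on the three example lines before any file is involved. The sketch below is only an illustration: the `lines` list and the variable names are assumptions made for this example, and `dict.get()` is used as one possible way to count, not the only one.

```python
# Illustrative sketch: the example text is held in a list instead of a file.
lines = ["python is easy", "python is powerful", "python is popular"]

freq = {}
for line in lines:
    for word in line.lower().split():
        # get() returns 0 when the word has not been seen yet
        freq[word] = freq.get(word, 0) + 1

# Sort (word, count) pairs by count, highest first, then keep at most 10
top = sorted(freq.items(), key=lambda pair: pair[1], reverse=True)[:10]

for word, count in top:
    print(word, ":", count)
```

Because the example has only five distinct words, the slice `[:10]` simply returns all five; slicing never fails when fewer items are available.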


Flowchart (Block Diagram of Logic)



Flowchart Explanation

  • Start

  • Open file

  • Read content

  • Convert to lowercase

  • Split into words

  • Count frequencies using dictionary

  • Sort by frequency (descending)

  • Print top 10 words

  • Stop


Algorithm

  1. Start

  2. Open text file in read mode

  3. Read file content

  4. Convert text to lowercase

  5. Split text into words

  6. Remove punctuation from each word

  7. Create empty dictionary freq

  8. For each word in list:

    • If word in dictionary → increment count

    • Else → add word with count 1

  9. Sort dictionary items in descending order of frequency

  10. Print first 10 elements

  11. Stop


Python Program

⚠ Students must complete the missing parts.


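# Program to print the 10 most frequent words in a file

import string

filename = input("Enter file name: ")

# Read the whole file; the with-block closes it automatically
with open(filename, "r") as file:
    text = file.read()

# Lowercase so that "Python" and "python" count as the same word
text = text.lower()
words = text.split()

freq = {}

for word in words:
    # Remove punctuation from both ends, e.g. "easy." becomes "easy"
    word = word.strip(string.punctuation)
    if word == "":
        continue  # the token was only punctuation
    if word in freq:
        freq[word] = freq[word] + __________
    else:
        freq[word] = __________

# Sort dictionary by frequency (descending)
sorted_words = sorted(freq.items(), key=lambda x: x[1], reverse=__________)

print("Top 10 Most Frequent Words:")

for word, count in sorted_words[:__________]:
    print(word, ":", count)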


Probable Output

Assume file content:
Python is easy. Python is powerful. Python is popular.

Output:
Top 10 Most Frequent Words:
python : 3
is : 3
easy : 1
powerful : 1
popular : 1
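
Note that the sample file contains full stops, while the words in the output do not. The short sketch below shows why the punctuation step matters; it assumes the standard `string.punctuation` constant is an acceptable list of characters to strip, which may not suit every text.

```python
import string

text = "Python is easy. Python is powerful. Python is popular."

raw_words = text.lower().split()
# strip() removes the listed characters from both ends of each word only
clean_words = [w.strip(string.punctuation) for w in raw_words]

print(raw_words[2])    # easy.
print(clean_words[2])  # easy
```

Without stripping, "easy." and "easy" would be stored as two different dictionary keys and counted separately.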

Important Concepts Used

  • File handling (open(), read())

  • String methods (lower(), split(), strip())

  • Dictionary

  • Lambda function

  • Sorting with sorted()

  • List slicing
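
Several of these concepts meet in the sorting step. The sketch below uses a small made-up dictionary (an assumption for illustration, not data from a real file) to show what `items()`, the lambda key, and slicing each contribute. It also shows that `sorted()` is stable: words with equal counts keep the order in which they were inserted into the dictionary, even with `reverse=True`.

```python
# Made-up frequencies, inserted in this order on purpose
freq = {"easy": 1, "python": 3, "popular": 1, "is": 3}

# items() yields (word, count) pairs; the lambda picks the count as sort key
pairs = sorted(freq.items(), key=lambda x: x[1], reverse=True)

print(pairs)
# [('python', 3), ('is', 3), ('easy', 1), ('popular', 1)]

# Slicing keeps only the first two pairs
print(pairs[:2])
# [('python', 3), ('is', 3)]
```

This is why "python" is printed before "is" in the sample output even though both occur three times: "python" was added to the dictionary first.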


Viva Voce Questions

  1. What is file handling?

  2. Why do we convert text to lowercase?

  3. What does split() do?

  4. Why do we remove punctuation?

  5. What does lambda x: x[1] mean?

  6. What does the dictionary method items() return?

  7. What is slicing in Python?



Procedure to Execute the Program

  1. Create a text file (example: sample.txt)

  2. Write a paragraph of text in it

  3. Save the file in the same folder as the program

  4. Run the Python program

  5. Enter file name when prompted
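
If creating the file by hand is inconvenient (for example in an online notebook), steps 1 to 3 can also be done from Python. This is a minimal sketch that assumes the file name sample.txt from step 1 and reuses the sample sentence from the Probable Output section; note that it overwrites any existing file with that name.

```python
# Illustrative sketch: write a small test file next to the program.
sample_text = "Python is easy. Python is powerful. Python is popular.\n"

with open("sample.txt", "w") as file:
    file.write(sample_text)

# Read it back to confirm the file exists and holds the expected text
with open("sample.txt", "r") as file:
    print(file.read())
```

After running this, enter sample.txt when the main program asks for the file name.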


Google Colab – Online Execution

👉 Click the link below to run the complete working version of this program online:
[https://colab.research.google.com/drive/15lkoP6NmD2JVUOask7W1vOx-DDVJbAv5?usp=sharing]


Program Explanation




Assignment for Students

  • Ignore stop words (like “the”, “is”, “and”)

  • Display results in tabular format

  • Save output to another file

  • Handle file-not-found error using try-except



Lab Record Instructions

Right Side of the Record
Write the following neatly on the right-hand side page:
1. Problem Statement
2. Python Program


Left Side of the Record
Write the following neatly on the left-hand side page:
3. Flowchart
4. Algorithm
5. Output
⚠️ Students must write ALL possible outputs.



-:END:-
