2. Installing Spark and Practicing Python
2022/3/6 20:15:31
I. Installing Spark
Check the base environment: Hadoop and the JDK
Edit the configuration files
Test-run some Python code
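The environment check above can be scripted. The following is a minimal sketch, not the procedure used in this exercise: the tool names (`java`, `hadoop`, `spark-submit`, `pyspark`) and the variable names (`JAVA_HOME`, `HADOOP_HOME`, `SPARK_HOME`) are assumptions based on common conventions, so adjust them to match your installation.

```python
import os
import shutil


def check_environment(tools=("java", "hadoop", "spark-submit", "pyspark"),
                      variables=("JAVA_HOME", "HADOOP_HOME", "SPARK_HOME")):
    """Return where each tool is found on PATH and the value of each variable.

    The default names are assumptions, not taken from the exercise itself.
    """
    found = {name: shutil.which(name) for name in tools}      # None if missing
    env = {name: os.environ.get(name) for name in variables}  # None if unset
    return found, env


if __name__ == "__main__":
    found, env = check_environment()
    for name, path in found.items():
        print(f"{name:13} {path or 'NOT FOUND on PATH'}")
    for name, value in env.items():
        print(f"{name:13} {value or 'not set'}")
```

If a tool is reported as missing, the usual cause is that its `bin` directory has not been added to `PATH` in the shell configuration file.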
II. Python programming exercise: word frequency count for an English text
Prepare the text file: heal-the-world.txt
There's a place in your heart And I know that it is love And this place could be much brighter than tomorrow And if you really try You'll find there's no need to cry In this place you'll feel There's no hurt or sorrow There are ways to get there If you care enough for the living Make a little space Make a better place Heal the world Make it a better place For you and for me And the entire human race There are people dying If you care enough for the living Make it a better place For you and for me If you want to know why There's a love that cannot lie Love is strong It only cares for joyful giving If we try we shall see In this bliss we cannot feel Fear or dread We stop existing and start living Then it feels that always Love's enough for us growing Make a better world Make a better world Heal the world Make it a better place For you and for me And the entire human race There are people dying If you care enough for the living Make a better place for you and for me And the dream we were conceived in Will reveal a joyful face And the world we once believed in Will shine again in grace Then why do we keep strangling life Wound this earth, crucify its soul Though it's plain to see This world is heavenly be god's glow We could fly so high Let our spirits never die In my heart I feel you are all my brothers Create a world with no fear Together we’ll cry happy tears We see the nations turn their swords into plowshares We could really get there If you cared enough for the living Make a little space To make a better place Heal the world Make it a better place For you and for me And the entire human race There are people dying If you care enough for the living Make a better place for you and for me Heal the world Make it a better place For you and for me And the entire human race There are people dying If you care enough for the living Make a better place for you and for me Heal the world Make it a better place For you and for me And the entire human race There are people 
dying If you care enough for the living Make a better place for you and for me There are people dying If you care enough for the living Make a better place for you and for me There are people dying If you care enough for the living Make a better place for you and for me You and for me You and for me You and for me You and for me
Read the file and preprocess it: case normalization, punctuation removal, tokenization, stop-word filtering (main.py)
# main.py: read the text and preprocess it
with open("heal-the-world.txt", "r", encoding="utf-8") as f:
    text = f.read()

text = text.lower()  # normalize case
# Replace ASCII punctuation with spaces
for ch in '!@#$%^&*(_)-+=\\[]}{|;:\'\"`~,<.>?/':
    text = text.replace(ch, " ")
words = text.split()  # split on whitespace

# Load the stop words, one per line
with open('stop_words.txt', 'r', encoding="utf-8") as f:
    stop_words = {line.strip() for line in f}

# Keep only the words that are not stop words
afterwords = [word for word in words if word not in stop_words]
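To see what each preprocessing step does, it helps to run them on a short in-memory string. The sketch below wraps the same steps in a function. The name `preprocess`, the sample sentence, and the sample stop words are made up for illustration. It uses `string.punctuation`, which covers the same 32 ASCII punctuation characters as the literal in main.py.

```python
import string


def preprocess(text, stop_words):
    """Lowercase, turn ASCII punctuation into spaces, split, drop stop words."""
    # Map every ASCII punctuation character to a space in a single pass.
    table = str.maketrans({ch: " " for ch in string.punctuation})
    words = text.lower().translate(table).split()
    stop = set(stop_words)  # set membership is O(1) per lookup
    return [w for w in words if w not in stop]


# Hypothetical sample input, not part of the exercise data.
sample = "Don't panic: it's only a test!"
print(preprocess(sample, ["it", "only", "a"]))
# ['don', 't', 'panic', 's', 'test']
```

Because the apostrophe is treated as punctuation, contractions split into fragments such as `t` and `s`, which then count as words of their own.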
Count the occurrences of each word, sort by frequency, and write the result to a file (main.py)
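# main.py (continued): count the words and write the result
counts = {}
for word in afterwords:
    counts[word] = counts.get(word, 0) + 1

# Sort by count, most frequent first
items = sorted(counts.items(), key=lambda x: x[1], reverse=True)

# Write one "word count" pair per line; the with block closes the file
with open('count.txt', 'w', encoding="utf-8") as f1:
    for word, count in items:
        f1.write(word + " " + str(count) + "\n")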
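The counting and sorting can also be done with `collections.Counter` from the standard library. This is an alternative sketch rather than the code used in the exercise, and the function names `word_frequencies` and `write_counts` are hypothetical.

```python
from collections import Counter


def word_frequencies(words):
    """Return (word, count) pairs sorted by count, highest first."""
    return Counter(words).most_common()


def write_counts(pairs, path):
    """Write one "word count" pair per line to the file at path."""
    with open(path, "w", encoding="utf-8") as out:
        for word, count in pairs:
            out.write(f"{word} {count}\n")


pairs = word_frequencies(["make", "place", "make", "world", "place", "make"])
print(pairs)  # [('make', 3), ('place', 2), ('world', 1)]
```

`most_common()` keeps words with equal counts in the order they were first seen, which is the same tie order a stable sort on the count gives.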
Output
a 22
make 18
place 17
living 10
world 10
care 8
people 7
dying 7
s 7
human 5
heal 5
entire 5
race 5
love 4
feel 3
heart 2
fear 2
space 2
ll 2
i 2
cry 2
joyful 2
crucify 1
create 1
we’ll 1
existing 1
high 1
fly 1
earth 1
face 1
find 1
turn 1
nations 1
spirits 1
ways 1
god 1
swords 1
wound 1
start 1
tomorrow 1
cared 1
brighter 1
tears 1
bliss 1
heavenly 1
glow 1
sorrow 1
reveal 1
plowshares 1
shine 1
life 1
brothers 1
lie 1
conceived 1
stop 1
hurt 1
believed 1
feels 1
strangling 1
strong 1
grace 1
plain 1
soul 1
cares 1
dread 1
happy 1
die 1
growing 1
giving 1
dream 1
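The entries `s 7` and `ll 2` in the output are fragments of contractions: the straight apostrophe is replaced by a space, so a word such as "there's" splits in two. The entry `we’ll 1` survives intact only because its curly apostrophe is not in the punctuation list. If contractions should be counted as whole words, one option is a regular-expression tokenizer. This is a sketch under that assumption. The pattern and the name `tokenize` are illustrative, and the stop-word list would then need to include the contractions as well.

```python
import re

# Runs of letters, optionally joined by a straight (') or curly (’) apostrophe.
TOKEN = re.compile(r"[a-z]+(?:['’][a-z]+)*")


def tokenize(text):
    """Lowercase the text and return its words with contractions kept whole."""
    # Normalize curly apostrophes so both spellings count as the same word.
    return [t.replace("’", "'") for t in TOKEN.findall(text.lower())]


print(tokenize("You'll see: we’ll fix it, there's time."))
# ["you'll", 'see', "we'll", 'fix', 'it', "there's", 'time']
```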
III. Setting up a programming environment with PyCharm: Ubuntu 16.04 + PyCharm + Spark
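A script run from PyCharm can only `import pyspark` if the interpreter can find the package. One common approach, sketched here as an assumption rather than as the setup used in this exercise, is to add the `python` directory and the bundled `py4j-*-src.zip` archive under `SPARK_HOME` to `sys.path`. The helper names `pyspark_paths` and `add_pyspark_to_path` are hypothetical.

```python
import glob
import os
import sys


def pyspark_paths(spark_home):
    """Return the entries under spark_home that expose pyspark and py4j."""
    python_dir = os.path.join(spark_home, "python")
    # Spark ships py4j as a zip under python/lib; the version in the name varies.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def add_pyspark_to_path(spark_home=None):
    """Prepend the PySpark entries to sys.path and return the ones added."""
    spark_home = spark_home or os.environ.get("SPARK_HOME")
    if not spark_home:
        raise RuntimeError("SPARK_HOME is not set")
    added = [p for p in pyspark_paths(spark_home) if p not in sys.path]
    sys.path[:0] = added
    return added
```

Call `add_pyspark_to_path()` before `import pyspark`, or set `SPARK_HOME` and `PYTHONPATH` to the same values in the PyCharm run configuration. Installing the `pyspark` package into the project interpreter with pip is another option.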