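Timmy's Column: Study Notes on Automate the Boring Stuff with Python (1)

Following the earlier post 〈Python輕鬆上手學課堂筆記〉, this post covers 《Python自動化的樂趣－搞定重複瑣碎&單調無聊的工作》 (Automate the Boring Stuff with Python: Practical Programming for Total Beginners), written by Al Sweigart and translated by H&C. The book suits beginners well and shows how to build all kinds of small custom tools in Python. Below are Timmy's study notes on chapters 1 to 10.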

1. Python Basics, Modules, and Setup

Python自動化的樂趣－搞定重複瑣碎&單調無聊的工作 (by Al Sweigart, translated by H&C)

• Original English edition: Automate the Boring Stuff with Python

• The author's blog and tutorials

• Example files for the book

Ending a program early with sys.exit()

import sys
while True:
    a = input("Type EXIT to exit.")
    if a == "EXIT":
        sys.exit()
    print("You typed " + a + ".")

The random module

• Use random.randint() to get a random integer between two integers, both ends included:
import random
for a in range(5):
    # Draws from the 10 numbers 1 to 10; the same number can come up more than once
    print(random.randint(1, 10))

• Use random.shuffle() to randomly reorder the values of a list in place:
import random
a = [1, 2, 3, 4, 5, 6, 7]
print(a)  # → [1, 2, 3, 4, 5, 6, 7]
random.shuffle(a)
print(a)

• Use random.sample() to pick a given number of values from a list without repeats:
import random
a = [1, 2, 3, 4, 5, 6, 7]
# Pick 3 at random
print(random.sample(a, 3))

• Use random.choice() to pick one value from a list at random:
import random
a = ["Hello", "Poor", "World"]
for i in range(10):
    print(random.choice(a))
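While experimenting with these functions it can help to make the results repeatable. The sketch below seeds the generator first; the seed value 42 and the variable names are arbitrary choices for illustration, not something from the book.

```python
import random

random.seed(42)  # arbitrary seed, used only so that reruns give the same sequence
deck = [1, 2, 3, 4, 5, 6, 7]

picked = random.sample(deck, 3)   # 3 distinct values; deck itself is left untouched
shuffled = deck[:]                # copy first, because shuffle() reorders in place
random.shuffle(shuffled)

print(picked)
print(shuffled)
```

Copying the list before shuffling keeps the original order available for later use.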

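The pprint module

• After importing the pprint module, pprint.pprint() and pprint.pformat() print the contents of a dictionary in a clean, aligned layout.

• Counting characters:
import pprint
a = "Hello World"
b = {}
for c in a:
    b.setdefault(c, 0)
    b[c] = b[c] + 1
# The one-item-per-line layout only appears when the output is too long for a single line
# pprint.pprint(b) is equivalent to print(pprint.pformat(b))
pprint.pprint(b)
# → {' ': 1, 'H': 1, 'W': 1, 'd': 1, 'e': 1, 'l': 3, 'o': 2, 'r': 1}  (pprint sorts the keys by default)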
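The same character count can also be written with collections.Counter from the standard library, which is a swapped-in technique rather than the book's approach. The width=20 argument is chosen here only to force pprint.pformat() into its one-item-per-line layout on such a short dictionary.

```python
import pprint
from collections import Counter

counts = Counter("Hello World")   # maps each character to how often it occurs

# A narrow width forces the aligned, one-item-per-line layout
text = pprint.pformat(dict(counts), width=20)
print(text)
```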

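The pyperclip module

• The pyperclip module gives access to the system clipboard. Run pip install pyperclip at the command prompt to download and install it.

• After importing pyperclip, pyperclip.copy() sends text to the clipboard and pyperclip.paste() reads text from it.

• Adding a bullet to every line of the clipboard text:
import pyperclip
a = pyperclip.paste()
b = a.split('\n')
for c in range(len(b)):
    b[c] = "* " + b[c]
a = "\n".join(b)
pyperclip.copy(a)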

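Running programs outside a Python development tool

• To run a Python program from the command prompt or a terminal, put a #! (shebang) line on the first line of the source file. The form of this line differs between systems; on Windows it is #! python3.

• Use Notepad to save a batch file named pythonScript.bat, so that the program can later be run without typing its full path. Keep pythonScript.py and pythonScript.bat in the same folder, and put the following lines in pythonScript.bat:
@py.exe "C:\Path\to\your\pythonScript.py" %*
@pause

• So that the batch file itself can be started without typing its path, the operating system also has to be configured beforehand. On Windows:
Windows Settings → Edit the system environment variables → System Properties → Environment Variables → Path → Edit → append ;C:\Path\to\your (the folder path) to the end of the value

• Once these three steps are done, press Win-R and type pythonScript to run pythonScript.bat.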

Running programs outside a Python development tool: projects

• Password manager:
#! python3
passwords = {"LockerA": "LockerADefaultPassword",
           "LockerB": "LockerBDefaultPassword",
           "LockerC": "LockerCDefaultPassword"}
import sys, pyperclip
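# sys.argv holds the command-line arguments: sys.argv[0] is the script that the batch file pythonScript hands to Python, and sys.argv[1] is the lookup text (the locker name) typed after pythonScript and a space in the Win-R box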
if len(sys.argv) < 2:
    sys.exit()
elif sys.argv[1] in passwords:
    pyperclip.copy(passwords[sys.argv[1]])
    print("Password for " + sys.argv[1] + " Copied to Clipboard")
else:
    print("There\'s No Locker Named " + sys.argv[1])
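# Press Win-R, type the batch file name, a space, and the text to look up, for example pythonScript LockerA, then click OK to run the program

• Extended multi-clipboard:
After finishing Section 3 (Reading and Writing Files), see "Chapter 8: practice project reference code, Extended multi-clipboard" in Section 6.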

2. Regular Expressions (Regex)

Regular expression references

• Python Software Foundation

• Regular-Expressions.info

Matching rules

• The | character is called a pipe and separates alternatives. When more than one alternative occurs in the searched text, the first occurrence is returned as the Match object. To match a literal pipe character, escape it as \|:
import re
# aRegex holds a Regex object
aRegex = re.compile(r"Hello|World")
# aMatch holds a Match object
aMatch = aRegex.search("Hello World")
print(aMatch)  # → <re.Match object; span=(0, 5), match='Hello'>  (Python 3.6 and earlier show _sre.SRE_Match)
print(aMatch.group())  # → Hello
bMatch = aRegex.search("World, Hello")
print(bMatch.group())  # → World
bRegex = re.compile(r"Pre(caution|pare|cise)")
cMatch = bRegex.search("Hope for the Best, Prepare for the Worst")
print(cMatch.group())  # → Prepare

• The ? character (question mark) makes the preceding group optional. To match a literal question mark, escape it as \?:
import re
aRegex = re.compile(r"Bat(wo)?man")
aMatch = aRegex.search("Batman Stars in That Film")
print(aMatch.group())  # → Batman
bMatch = aRegex.search("Batwoman Stars in That Film")
print(bMatch.group())  # → Batwoman

• The * character (star) matches the preceding group zero or more times. To match a literal star, escape it as \*:
import re
aRegex = re.compile(r"Bat(wo)*man")
aMatch = aRegex.search("Batman Stars in That Film")
print(aMatch.group())  # → Batman
bMatch = aRegex.search("Batwoman Stars in That Film")
print(bMatch.group())  # → Batwoman
cMatch = aRegex.search("Batwowoman Stars in That Film")
print(cMatch.group())  # → Batwowoman

• The + character (plus) matches the preceding group one or more times. To match a literal plus, escape it as \+:
import re
aRegex = re.compile(r"Bat(wo)+man")
aMatch = aRegex.search("Batman Stars in That Film")
# aMatch.group() would raise AttributeError here, because search() found nothing and returned None
print(aMatch == None)  # → True
bMatch = aRegex.search("Batwoman Stars in That Film")
print(bMatch.group())  # → Batwoman
cMatch = aRegex.search("Batwowoman Stars in That Film")
print(cMatch.group())  # → Batwowoman

• The ^ character (caret) requires the match to be at the start of the string:
import re
aRegex = re.compile(r"^Hello")
aMatch = aRegex.search("Hello World")
print(aMatch.group())  # → Hello
bMatch = aRegex.search("World, Hello")
print(bMatch == None)  # → True

• The $ character (dollar sign) requires the match to be at the end of the string:
import re
aRegex = re.compile(r"Hello$")
aMatch = aRegex.search("Hello World")
print(aMatch == None)  # → True
bMatch = aRegex.search("World, Hello")
print(bMatch.group())  # → Hello

• With a caret ^ at the start and a dollar sign $ at the end, the whole string has to match:
import re
aRegex = re.compile(r"^Hello World$")
aMatch = aRegex.search("Hello World")
print(aMatch.group())  # → Hello World
bMatch = aRegex.search("Hello Poor World")
print(bMatch == None)  # → True
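A common use of the ^...$ pairing is input validation. The sketch below checks whether a string consists only of digits, and also shows re.fullmatch(), a standard-library function that expresses the same intent without writing the anchors. The variable names are illustrative.

```python
import re

# ^ and $ together force the whole string to fit the pattern
digitsOnly = re.compile(r"^\d+$")
print(digitsOnly.search("12345") is not None)    # the whole string is digits
print(digitsOnly.search("123a45") is not None)   # the letter breaks the match

# re.fullmatch() succeeds only when the entire string matches the pattern
allDigits = re.fullmatch(r"\d+", "12345") is not None
mixed = re.fullmatch(r"\d+", "123a45") is not None
print(allDigits, mixed)
```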

Character classes and object methods

• Shorthand character classes:
\d matches any digit from 0 to 9
\D matches any character that is not a digit from 0 to 9
\w matches any letter, digit, or underscore
\W matches any character that is not a letter, digit, or underscore
\s matches a space, tab, or newline
\S matches any character that is not a space, tab, or newline

• Regex object methods, search() and findall():
import re
aRegex = re.compile(r"\d+\s\w+")
# search() returns a Match object for the first match found
aMatch = aRegex.search("Ariana Grande Has Released 3 Albums and 9 Videos")
print(aMatch.group())  # → 3 Albums
# With no groups in the regex, findall() returns a list of strings
bMatch = aRegex.findall("Ariana Grande Has Released 3 Albums and 9 Videos")
print(bMatch)  # → ['3 Albums', '9 Videos']
# With groups in the regex, findall() returns a list of tuples
bRegex = re.compile(r"((\d+)\s(\w+))")
cMatch = bRegex.findall("Ariana Grande Has Released 3 Albums and 9 Videos")
print(cMatch)  # → [('3 Albums', '3', 'Albums'), ('9 Videos', '9', 'Videos')]

• Regex object method, sub():
# sub() takes two arguments: first the replacement text, then the string to search
import re
aRegex = re.compile(r"Agent \w+")
aMatch = aRegex.sub("CENSORED", "Agent Wendy Requested Agent Timmy to Help in Case Things Went from Bad to Worse")
print(aMatch)  # → CENSORED Requested CENSORED to Help in Case Things Went from Bad to Worse
# Reuse part of the matched text in the replacement; \1 stands for group 1
bRegex = re.compile(r"Agent (\w)\w*")
bMatch = bRegex.sub(r"\1****", "Agent Wendy Requested Agent Timmy to Help in Case Things Went from Bad to Worse")
print(bMatch)  # → W**** Requested T**** to Help in Case Things Went from Bad to Worse
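The replacement passed to sub() can also be a function, which the re module calls once per match with the Match object. This goes beyond the notes above; the sketch uses a hypothetical helper named mask so that the number of asterisks follows the length of each name instead of being fixed at four.

```python
import re

def mask(match):
    # Keep the first letter of the captured name and hide the rest
    name = match.group(1)
    return name[0] + "*" * (len(name) - 1)

agentRegex = re.compile(r"Agent (\w+)")
masked = agentRegex.sub(mask, "Agent Wendy Requested Agent Timmy to Help")
print(masked)
```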

• Match object methods, group() and groups():
import re
# In the raw string passed to re.compile(), \( and \) match literal ( and ) characters
aRegex = re.compile(r"(\(\d\d\))(\d\d\d\d-\d\d\d\d)")
aMatch = aRegex.search("My Phone Number Is (02)2737-7866")
print(aMatch.group())  # → (02)2737-7866
print(aMatch.group(0))  # → (02)2737-7866
print(aMatch.group(1))  # → (02)
print(aMatch.group(2))  # → 2737-7866
print(aMatch.groups())  # → ('(02)', '2737-7866')
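When a pattern has several groups, numbering them gets hard to follow. The re module also supports named groups with the (?P<name>...) syntax; the sketch below applies it to the same phone number. The group names area and number are made up for this example.

```python
import re

phoneRegex = re.compile(r"(?P<area>\(\d\d\))(?P<number>\d{4}-\d{4})")
phoneMatch = phoneRegex.search("My Phone Number Is (02)2737-7866")

area = phoneMatch.group("area")      # same text as group(1)
number = phoneMatch.group("number")  # same text as group(2)
parts = phoneMatch.groupdict()       # all named groups as a dictionary
print(area, number, parts)
```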

• Defining your own character class, a negated character class, and a character class with ranges. Inside square brackets [], regex symbols such as . * ? ( ) are taken literally and need no backslash escape:
import re
# A character class
aRegex = re.compile(r"[aeiouAEIOU]")
aMatch = aRegex.findall("Ariana Grande Has Released 3 Albums and 9 Videos")
print(aMatch)  # → ['A', 'i', 'a', 'a', 'a', 'e', 'a', 'e', 'e', 'a', 'e', 'A', 'u', 'a', 'i', 'e', 'o']
# A negated character class
bRegex = re.compile(r"[^aeiouAEIOU]")
bMatch = bRegex.findall("Ariana Grande Has Released 3 Albums and 9 Videos")
print(bMatch)  # → ['r', 'n', ' ', 'G', 'r', 'n', 'd', ' ', 'H', 's', ' ', 'R', 'l', 's', 'd', ' ', '3', ' ', 'l', 'b', 'm', 's', ' ', 'n', 'd', ' ', '9', ' ', 'V', 'd', 's']
# A character class with ranges
aRegex = re.compile(r"[a-eA-E0-4]")
aMatch = aRegex.findall("My Phone Number Is (02)2737-7866")
print(aMatch)  # → ['e', 'b', 'e', '0', '2', '2', '3']

• The . character (dot) is a wildcard that matches any character except a newline. To match a literal dot, escape it as \.:
import re
aRegex = re.compile(r".ool")
aMatch = aRegex.findall("A Group of Fools Jump in the Pool on Cool Days")
print(aMatch)  # → ['Fool', 'Pool', 'Cool']

Greedy matching

• A number in braces {} sets how many times the preceding item must repeat:
import re
aRegex = re.compile(r"\(\d{2}\)\d{4}-\d{4}")
aMatch = aRegex.search("My Phone Number Is (02)2737-7866")
print(aMatch.group())  # → (02)2737-7866

• With two numbers in braces {}, the first is the minimum and the second is the maximum; either one may be left out:
import re
aRegex = re.compile(r"(Hello){,3}")
aMatch = aRegex.search("HelloHello")
print(aMatch.group())  # → HelloHello
bRegex = re.compile(r"(Hello){2,}")
bMatch = bRegex.search("HelloHelloHello")
print(bMatch.group())  # → HelloHelloHello

• Python regular expressions are greedy by default, so (Hello){2,4} takes as many repetitions as are available, up to 4, instead of stopping at 2. A ? after the braces makes the match non-greedy:
import re
# Greedy matching
aRegex = re.compile(r"(Hello){2,4}")
aMatch = aRegex.search("HelloHelloHello")
print(aMatch.group())  # → HelloHelloHello
# Non-greedy matching
bRegex = re.compile(r"(Hello){2,4}?")
bMatch = bRegex.search("HelloHelloHello")
print(bMatch.group())  # → HelloHello

• Greedy and non-greedy wildcard matching; .* simply combines the behavior of . and *:
import re
# Greedy matching
aRegex = re.compile(r"<.*>")
aMatch = aRegex.search("<Hello>World>")
print(aMatch.group())  # → <Hello>World>
# Non-greedy matching
bRegex = re.compile(r"<.*?>")
bMatch = bRegex.search("<Hello>World>")
print(bMatch.group())  # → <Hello>
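The difference matters most with findall(), where a greedy wildcard can swallow everything between the first and last delimiter. A small sketch with a made-up HTML-like string:

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs to the last > in the string, so there is a single match
greedyTags = re.findall(r"<.*>", text)

# Non-greedy: .*? stops at the first > it can, so every tag is found separately
lazyTags = re.findall(r"<.*?>", text)

print(greedyTags)
print(lazyTags)
```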

Arguments to re.compile()

• Pass re.DOTALL as the second argument to re.compile() so that the dot also matches newline characters:
import re
aRegex = re.compile(".*")
aMatch = aRegex.search("Hello\nWorld")
print(aMatch.group())  # → Hello
bRegex = re.compile(".*", re.DOTALL)
bMatch = bRegex.search("Hello\nWorld")
print(bMatch.group())  # → Hello
                       #   World

• Pass re.IGNORECASE or re.I as the second argument to re.compile() to match without regard to letter case:
import re
aRegex = re.compile(r"RageCandyBar", re.I)
aMatch = aRegex.search("RageCandyBar Is a Key Item in HGSS")
print(aMatch.group())  # → RageCandyBar
bMatch = aRegex.search("Ragecandybar Is a Key Item in HGSS")
print(bMatch.group())  # → Ragecandybar

• Pass re.VERBOSE as the second argument to re.compile() to spread a complex regular expression over several lines with comments. In this position the pipe character | is Python's bitwise OR operator, which combines re.DOTALL, re.IGNORECASE, and re.VERBOSE:
import re
aRegex = re.compile(r"""(
    (TAIPEI)             # City
    (\d{2}|\(\d{2}\))    # Area Code
    )""", re.IGNORECASE|re.VERBOSE)
aMatch = aRegex.search("Taipei(02)27377866")
print(aMatch.group())  # → Taipei(02)

3. Reading and Writing Files

Paths

• Use os.makedirs() to create a new folder, along with any missing parent folders:
import os
os.makedirs(r"C:\Hello\Poor\World")
# exist_ok=True keeps the function from raising an exception when the folder already exists; see Section 5 (Debugging)
os.makedirs(r"C:\Hello\Poor\World", exist_ok = True)

• Use os.getcwd() to get the current working directory (CWD) as a string, and os.chdir() to change it:
import os
print(os.getcwd())  # → C:\Users\Timmy\Desktop
os.chdir(r"C:\Windows\System32")
print(os.getcwd())  # → C:\Windows\System32
os.chdir(r"C:\ThisFolderDoesNotExist")  # → FileNotFoundError

• Use os.path.join() to build a path string; it joins folder and file names with the separator of the operating system in use (Windows, OS X, or Linux). The results below are from Windows:
import os
print(os.path.join("Users", "Timmy", "test"))  # → Users\Timmy\test
print(os.path.join(r"C:\Users\Timmy", "test.py"))  # → C:\Users\Timmy\test.py

• In a relative path, a single dot (.) stands for the current working directory and two dots (..) for its parent folder. The leading .\ is optional: .\test.txt means the same as test.txt:
import os
# Convert a relative path to an absolute path
print(os.path.abspath("."))  # → C:\Users\Timmy\Desktop
print(os.path.abspath(r".\test.py"))  # → C:\Users\Timmy\Desktop\test.py
# Check whether a path is absolute
print(os.path.isabs("."))  # → False
print(os.path.isabs(os.path.abspath(".")))  # → True

• os.path.relpath(path, start) returns the relative path from start to path as a string; when start is omitted, the current working directory is used:
import os
# The current working directory is C:\Users\Timmy\Desktop
print(os.path.relpath(r"C:\Windows", "C:\\"))  # → Windows
print(os.path.relpath(r"C:\Windows", r"C:\Hello\World"))  # → ..\..\Windows
print(os.path.relpath(r"C:\Windows"))  # → ..\..\..\Windows

• Directory name and base name:
import os
calcPath = r"C:\Windows\System32\calc.exe"
# The directory name
print(os.path.dirname(calcPath))  # → C:\Windows\System32
# The base name
print(os.path.basename(calcPath))  # → calc.exe
# The directory name and base name together
print(os.path.split(calcPath))  # → ('C:\\Windows\\System32', 'calc.exe')
# Split at every folder, using the separator of the operating system
print(calcPath.split(os.path.sep))  # → ['C:', 'Windows', 'System32', 'calc.exe']

• os.path module reference:
Python Software Foundation
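The standard library's pathlib module offers an object-oriented alternative to the os.path functions above; it is not part of these notes and is shown only for comparison. The sketch uses PureWindowsPath so that it produces Windows-style results on any operating system without touching the disk.

```python
from pathlib import PureWindowsPath

calcPath = PureWindowsPath(r"C:\Windows\System32\calc.exe")

dirName = str(calcPath.parent)   # counterpart of os.path.dirname()
baseName = calcPath.name         # counterpart of os.path.basename()
extension = calcPath.suffix      # the file extension, including the dot
parts = calcPath.parts           # every component of the path as a tuple

# The / operator joins path components, much like os.path.join()
joined = str(PureWindowsPath(r"C:\Users\Timmy") / "test.py")

print(dirName, baseName, extension, parts, joined)
```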

Files

• Checking whether a path exists and what kind of path it is:
import os
# Files or folders
print(os.path.exists(r"C:\Windows\System32"))  # → True
print(os.path.exists(r"C:\ThisFolderDoesNotExist"))  # → False
# Files only
print(os.path.isfile(r"C:\Windows\System32"))  # → False
print(os.path.isfile(r"C:\Windows\System32\calc.exe"))  # → True
# Folders only
print(os.path.isdir(r"C:\Windows\System32"))  # → True
print(os.path.isdir(r"C:\Windows\System32\calc.exe"))  # → False

• File names and file sizes:
import os
# os.listdir() returns a list of file names
print(os.listdir(r"C:\Users\Timmy\Music\Taylor Swift\1989 [Deluxe Edition]"))
# → ['01 Welcome To New York.wma', '02 Blank Space.wma', '03 Style.wma', …]
# os.path.getsize() returns the file size in bytes
print(os.path.getsize(r"C:\Windows\System32\calc.exe"))  # → 26112

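Plaintext files: read mode, write mode, and append mode

• Files written with Notepad are plaintext files, whereas binary files show up as garbage when opened in Notepad. The following covers plaintext files.

• Reading the file as one string:
# open() opens a file and returns a File object
# Files are opened in read mode by default, so the second argument "r" can be left out
a = open(r"C:\Users\Timmy\Desktop\test.txt", "r")
print(a.read())  # → Hello World
                 #   Hello Poor World
a.close()

• Reading the file as a list of lines:
a = open(r"C:\Users\Timmy\Desktop\test.txt", "r")
print(a.readlines())  # → ['Hello World\n', 'Hello Poor World']
a.close()

• Write mode and append mode:
# If the file passed to open() does not exist, both write mode and append mode create a new empty file
# Write mode overwrites the existing file from the beginning
a = open("test.txt", "w")
# Unlike print(), write() does not add a newline to the end of the string
a.write("Hello World\n")
a.close()
# Append mode adds text to the end of the existing file
b = open("test.txt", "a")
b.write("Hello Poor World")
b.close()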
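The open/close pairs above can also be written with a with statement, which closes the file automatically even if an error occurs in between. The sketch below repeats the write, append, and read steps inside a temporary folder so that it leaves nothing behind; the temporary folder and the explicit encoding="utf-8" are choices made for this example, not part of the notes.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "test.txt")

    # Write mode: creates the file, or overwrites it from the beginning
    with open(path, "w", encoding="utf-8") as f:
        f.write("Hello World\n")

    # Append mode: adds text to the end of the existing file
    with open(path, "a", encoding="utf-8") as f:
        f.write("Hello Poor World")

    # Read mode is the default; readlines() returns one string per line
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()

print(lines)
```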

• Saving variables with pprint.pformat():
import pprint
bag = [{"itemOne": "Life Orb", "itemTwo": "Old Amber"}, {"keyItemOne": "Coin Case", "keyItemTwo": "GB Sounds"}]
a = open("test.py", "w")
a.write("bag = " + pprint.pformat(bag) + "\n")
a.close()
# The saved test.py can now be imported as a module
import test
print(test.bag)  # → [{'itemOne': 'Life Orb', 'itemTwo': 'Old Amber'}, {'keyItemOne': 'Coin Case', 'keyItemTwo': 'GB Sounds'}]
print(test.bag[0])  # → {'itemOne': 'Life Orb', 'itemTwo': 'Old Amber'}
print(test.bag[0]["itemOne"])  # → Life Orb

Saving variables with the shelve module

• The shelve module saves variables from a Python program to binary shelf files. On Windows, using shelve creates three new files in the current working directory: test.bak, test.dat, and test.dir.

• Creating, writing, and reading: a shelf value does not have to be opened in a write or read mode, because it can be both written and read once it is open:
import shelve
a = shelve.open("test")
a["items"] = ["Life Orb", "Dome Fossil", "Root Fossil", "Old Amber"]
print(type(a))  # → <class 'shelve.DbfilenameShelf'>
print(a["items"])  # → ['Life Orb', 'Dome Fossil', 'Root Fossil', 'Old Amber']
a.close()

• Shelf values have keys() and values() methods. These return list-like values rather than true lists, so pass them to list() to get an actual list:
import shelve
a = shelve.open("test")
a["items"] = ["Life Orb", "Dome Fossil", "Root Fossil", "Old Amber"]
print(list(a.keys()))  # → ['items']
print(list(a.values()))  # → [['Life Orb', 'Dome Fossil', 'Root Fossil', 'Old Amber']]
a.close()
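A shelf can also be opened in a with statement, which closes it automatically. The sketch below saves a value, reopens the shelf, and reads the value back, working inside a temporary folder so that no shelf files are left in the working directory; the temporary folder is a choice made for this example.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    shelfPath = os.path.join(folder, "test")

    # First session: store a list under the key "items"
    with shelve.open(shelfPath) as shelf:
        shelf["items"] = ["Life Orb", "Dome Fossil", "Root Fossil", "Old Amber"]

    # Second session: the data is still there after reopening
    with shelve.open(shelfPath) as shelf:
        keys = list(shelf.keys())
        items = shelf["items"]

print(keys, items)
```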

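4. Organizing Files

Copying files and folders

• The shutil module (shell utilities) can copy, move, rename, and delete files. Its functions take the source as the first argument and the destination as the second; a source given without a path is looked up in the current working directory.

• Use shutil.copy() to copy a file:
import shutil
# If the destination folder does not exist, the source is copied to a file named after the destination, with no extension
# A file with the same name already in the destination folder is overwritten
print(shutil.copy("D:\\fileTest.txt", "D:\\dirTest"))
# → D:\dirTest\fileTest.txt
# The destination folder must exist; otherwise FileNotFoundError is raised
# A file with the same name already in the destination folder is overwritten
print(shutil.copy("fileTest.txt", "D:\\dirTest\\fileTest2.txt"))
# → D:\dirTest\fileTest2.txt

• Use shutil.copytree() to copy a whole folder, including all its subfolders and files:
import shutil
# The destination folder must not exist; otherwise FileExistsError is raised
# The contents of the source folder are copied into the destination folder, which is created automatically; the source folder itself is not nested inside it
print(shutil.copytree("dirTest", "D:\\dirTest2"))  # → D:\dirTest2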

Moving files and folders

• Use shutil.move() to move a file:
import shutil
# If the destination folder does not exist, the source is moved to a file named after the destination, with no extension
# If the destination folder already holds a file with the same name, shutil.Error is raised
print(shutil.move("D:\\fileTest.txt", "D:\\dirTest"))
# → D:\dirTest\fileTest.txt
# The destination folder must exist; otherwise FileNotFoundError is raised
# A file with the same name already in the destination folder is overwritten
print(shutil.move("D:\\fileTest.txt", "D:\\dirTest\\fileTest2.txt"))
# → D:\dirTest\fileTest2.txt

• Use shutil.move() to move a folder:
import shutil
# When the destination folder does not exist
print(shutil.move("dirTest", "D:\\dirTest2"))  # → D:\dirTest2
# When the destination folder exists
# If it already holds a folder with the same name, shutil.Error is raised
print(shutil.move("dirTest", "D:\\dirTest2"))  # → D:\dirTest2\dirTest

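Permanently deleting files and folders

• Before deleting anything for real, comment out the delete call with # and print() the path instead, to confirm that the right paths are being targeted.

• Use os.unlink() to permanently delete one file:
import os
os.unlink("D:\\fileTest.txt")

• Use os.rmdir() to permanently delete one empty folder:
import os
os.rmdir("D:\\dirTest")

• Use shutil.rmtree() to permanently delete a whole folder, including all its subfolders and files:
import shutil
shutil.rmtree("D:\\dirTest")

Safely deleting files and folders

• The send2trash module deletes files and folders safely by sending them to the recycle bin. Run pip install send2trash at the command prompt to download and install it.

import send2trash
send2trash.send2trash("fileTest.txt")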

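Walking a directory tree with os.walk()

import os
# pathName: the path of the folder currently being visited, a string
# folderName: the names of the subfolders in that folder, a list of strings
# fileName: the names of the files in that folder, a list of strings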
for pathName, folderName, fileName in os.walk("D:\\dirTest"):
    for aFolder in folderName:
        print("Folder of " + pathName + ": " + aFolder)
    for aFile in fileName:
        print("File of " + pathName + ": " + aFile)
    print("")
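As a small application of os.walk(), the sketch below adds up the size of every file in a directory tree. The function name treeSize is hypothetical, and the demo builds its own tiny tree in a temporary folder so that it runs anywhere.

```python
import os
import tempfile

def treeSize(root):
    """Return the total size in bytes of all files under root."""
    total = 0
    for pathName, folderNames, fileNames in os.walk(root):
        for aFile in fileNames:
            total += os.path.getsize(os.path.join(pathName, aFile))
    return total

with tempfile.TemporaryDirectory() as root:
    # Build a small tree: one file at the top and one in a subfolder
    os.makedirs(os.path.join(root, "sub"))
    with open(os.path.join(root, "a.txt"), "wb") as f:
        f.write(b"12345")
    with open(os.path.join(root, "sub", "b.txt"), "wb") as f:
        f.write(b"123")
    size = treeSize(root)

print(size)
```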

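The zipfile module

• A Zip file is also called an archive file. In Python a ZipFile object represents the whole compressed file; create one by calling zipfile.ZipFile(), where zipfile is the name of the module and ZipFile is the name of the class being called.

• Reading a Zip file:
import zipfile
a = zipfile.ZipFile("fileTest.zip")
# namelist() returns a list of strings naming every file and folder in the Zip file
print(a.namelist())
# getinfo() returns a ZipInfo object for one particular file in the Zip file
b = a.getinfo("txtFileTest.txt")
# The file size before compression, in bytes
print(b.file_size)
# The file size after compression, in bytes
print(b.compress_size)
a.close()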

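• Extracting all files and folders from a Zip file:
import zipfile
a = zipfile.ZipFile("fileTest.zip")
a.extractall()
a.close()

• Extracting all files and folders from a Zip file into a particular folder:
import zipfile
a = zipfile.ZipFile("fileTest.zip")
a.extractall("D:\\")
a.close()

• Extracting one particular file from a Zip file:
import zipfile
a = zipfile.ZipFile("fileTest.zip")
a.extract("txtFileTest.txt")
a.close()

• Extracting one particular file from a Zip file into a particular folder:
import zipfile
a = zipfile.ZipFile("fileTest.zip")
# The target folder does not need to exist; it is created automatically
# A file with the same name already in the target folder is overwritten
a.extract("txtFileTest.txt", "D:\\")
a.close()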

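• Creating Zip files and adding to them:
import zipfile
# As with open(), "w" is write mode and "a" is append mode
a = zipfile.ZipFile("fileTest.zip", "w")
# The first argument to write() is the file or folder to add
# The compress_type keyword argument selects the compression method
a.write("txtFileTest.txt", compress_type=zipfile.ZIP_DEFLATED)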
a.close()
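The create and read steps above can be combined into one round trip. The sketch below writes a small file, compresses it, and reads the sizes back, using with statements so that the ZipFile objects are closed automatically. It works in a temporary folder, and the file contents are made up for the example.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as folder:
    txtPath = os.path.join(folder, "txtFileTest.txt")
    zipPath = os.path.join(folder, "fileTest.zip")

    # 100 repeated lines compress well, which makes the size difference visible
    with open(txtPath, "wb") as f:
        f.write(b"Hello World\n" * 100)

    # arcname stores the file under a plain name instead of its full path
    with zipfile.ZipFile(zipPath, "w") as archive:
        archive.write(txtPath, arcname="txtFileTest.txt",
                      compress_type=zipfile.ZIP_DEFLATED)

    with zipfile.ZipFile(zipPath) as archive:
        names = archive.namelist()
        info = archive.getinfo("txtFileTest.txt")
        originalSize = info.file_size
        packedSize = info.compress_size

print(names, originalSize, packedSize)
```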

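5. Debugging

Raising exceptions with raise Exception()

• In general, it is the code calling a function, not the function itself, that knows how to handle an exception. That is why raise statements usually sit inside a function, while the try and except statements sit in the code that calls it.

raise Exception("Error Message")
→
Traceback (most recent call last):
  File "C:\Users\Timmy\Desktop\test.py", line 1, in <module>
    raise Exception("Error Message")
Exception: Error Message

def boxPrint(symbol, width, height):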
    if len(symbol) != 1:
        raise Exception("Symbol must be a single character.")
    if width <= 2:
        raise Exception("Width must be greater than 2.")
    if height <= 2:
        raise Exception("Height must be greater than 2.")
    print(symbol * width)
    for i in range(height - 2):
        print(symbol + (" " * (width - 2)) + symbol)
    print(symbol * width)
def checkInt(inputValue):
    while not (type(inputValue) == int):
        inputValue = input("Please input an integer:")
        try:
            inputValue = int(inputValue)
            return inputValue
        except ValueError:
            continue
sym = input("Please input box symbol:")
print("Please input box width:")
wid = checkInt("default")
print("Please input box height:")
hei = checkInt("default")
try:
    boxPrint(sym, wid, hei)
except Exception as err:
    print("An error happened: " + str(err))

Getting the traceback as a string

import traceback
try:
    raise Exception("Error Message")
except:
    a = open("Error_Info.txt", "w")
    a.write(traceback.format_exc())
    a.close()
    print("The Traceback Info Was Written to Error_Info.txt")
# → The Traceback Info Was Written to Error_Info.txt
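The string returned by traceback.format_exc() can be kept in a variable and inspected before it is written anywhere. A minimal sketch, using a hypothetical function risky() and catching a specific exception type rather than a bare except:

```python
import traceback

def risky():
    raise ValueError("Error Message")

try:
    risky()
except ValueError:
    # format_exc() returns the same text Python would print for an uncaught error
    report = traceback.format_exc()

print(report)
```

The last line of the report names the exception type and message, which is usually the first thing to look at.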

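Assertions

• An assert statement targets mistakes made by the programmer. It is a sanity check that raises an exception when the check fails. For errors the program can recover from, raising an exception and handling it is the better approach.

# assert <expression that evaluates to True or False>, "string shown when the expression is False"
store = "Close"
assert store == "Open", "The Store Should Be \"Open\"."
→
Traceback (most recent call last):
  File "C:\Users\Timmy\Desktop\test.py", line 2, in <module>
    assert store == "Open", "The Store Should Be \"Open\"."
AssertionError: The Store Should Be "Open".

• Assertions are for development, not for the finished product. To disable them, pass the -O option when running Python from the command prompt:
python -O test.py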

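Logging

• Enabling logging:
With the logging module, log messages are held in LogRecord objects. Put the following code at the top of the program, below the #! line; because of the filename argument, the log messages are written to a file:
import logging
logging.basicConfig(filename = "Logging.txt", level = logging.DEBUG, format = "%(asctime)s-%(levelname)s-%(message)s")

• Logging levels:
# With level = logging.DEBUG, messages of every level are shown; at the other extreme, logging.CRITICAL shows only the highest level
import logging
logging.basicConfig(level = logging.DEBUG, format = "%(asctime)s-%(levelname)s-%(message)s")
logging.debug("Level 1")  # → 2018-01-27 09:03:59,502-DEBUG-Level 1
logging.info("Level 2")  # → 2018-01-27 09:03:59,502-INFO-Level 2
logging.warning("Level 3")  # → 2018-01-27 09:03:59,502-WARNING-Level 3
logging.error("Level 4")  # → 2018-01-27 09:03:59,502-ERROR-Level 4
logging.critical("Level 5")  # → 2018-01-27 09:03:59,502-CRITICAL-Level 5

• Disabling logging:
A single call to logging.disable(logging.CRITICAL) suppresses every log message issued after it. This is very convenient, and a good reason to debug with logging rather than with print().
logging.disable(logging.CRITICAL) suppresses all levels; at the other extreme, logging.disable(logging.DEBUG) suppresses only the lowest level.

• Logging example: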
import logging
logging.basicConfig(filename = "Logging.txt", level = logging.DEBUG, format = "%(asctime)s-%(levelname)s-%(message)s")
# logging.disable(logging.CRITICAL)
logging.info("Start of Program")
def factorial(n):
    logging.info("Start of Factorial %s"%(n))
    total = 1
    for i in range(1, n + 1):
        total *= i
        logging.debug("i = " + str(i) + ", total = " + str(total))
    logging.info("End of Factorial %s"%(n))
    return total
print(factorial(3))
logging.info("End of Program")
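To see the effect of logging.disable() without creating a log file, the sketch below sends messages to an in-memory stream instead. The logger name "disable_demo" and the shortened format string are assumptions made for this example; the logging calls themselves are standard library.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("disable_demo")   # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False                     # keep the demo output out of the root logger
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s-%(message)s"))
logger.addHandler(handler)

logger.debug("Level 1")          # recorded
logging.disable(logging.INFO)    # suppress INFO and everything below it
logger.info("Level 2")           # suppressed
logger.warning("Level 3")        # still recorded, WARNING is above INFO
logging.disable(logging.NOTSET)  # undo the suppression

output = stream.getvalue()
print(output)
```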

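The IDLE debugger

• Opening the IDLE debugger:
IDLE → Debug → Debugger

• In the Debug Control window, tick all of Stack, Source, Locals, and Globals to see the full debugging information.

• The five buttons in the Debug Control window:
Go: runs the program normally until it ends or reaches a breakpoint.
Step: runs the next line of code, then pauses again.
Over: runs the next line of code, then pauses again; a function call on that line is run through in one go, pausing after the function returns.
Out: runs normally until the current function returns.
Quit: stops the program immediately.

• Breakpoints:
To set a breakpoint, right-click the line of code and choose Set Breakpoint; to remove one, right-click the line and choose Clear Breakpoint.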

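6. Automate the Boring Stuff with Python: Reference Code for the Practice Projects in Chapters 3 to 10

Chapter 3: practice project reference code

• The Collatz Sequence, with input validation: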
def collatz(number):
    while number != 1:
        if number % 2 == 0:
            print(number // 2)
            number = number // 2
        elif number % 2 == 1:
            print(3 * number + 1)
            number = 3 * number + 1
    print("The End")
try:
    print("Enter a Number")
    collatz(int(input()))
except ValueError:
    print("Please Input an Integer")

Chapter 4: practice project reference code

• Comma Code:
a = ["Hello", "Poor", "World", "Bye", 123, 12.3]
for b in range(len(a) - 1):
    print(a[b], end = ", ")
print("and " + str(a[-1]))
# → Hello, Poor, World, Bye, 123, and 12.3

• Character Picture Grid:
a = [[0, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],]
d = []
for e in range(len(a)):
    d.append(len(a[e]))
d.sort()
for c in range(d[-1]):
    for b in range(len(a)):
        print(a[b][c], end = "")
    print("")
→
0110110
1111111
1111111
0111110
0011100
0001000

Chapter 5: practice project reference code

• Fantasy Game Inventory, and adding a list of items to the inventory dictionary:
def addToInventory(box, addedItems):
    for i in addedItems:
        if i in box:
            box[i] += 1
        else:
            box.setdefault(i, 1)
def displayInventory(box):
    print("Inventory:")
    count = 0
    for k, v in box.items():
        print(str(v) + " " + k)
        count += v
    print("Total number of items: " + str(count))
inv = {"rope": 1, "torch": 6, "gold coin": 42, "dagger": 1, "arrow": 12}
dragonLoot = ["gold coin", "dagger", "gold coin", "ruby"]
addToInventory(inv, dragonLoot)
displayInventory(inv)
→
Inventory:
1 rope
6 torch
44 gold coin
2 dagger
12 arrow
1 ruby
Total number of items: 66

Chapter 6: practice project reference code

• Table Printer:
tableData = [["apples", "oranges", "cherries", "banana"],
    ["Alice", "Bob", "Carol", "David"],
    ["dogs", "cats", "moose", "goose"]]
def printTable(a):
    colWidths = []
    for layerOne in range(len(a)):
        for layerTwo in range(len(a[layerOne])):
            colWidths.append(len(a[layerOne][layerTwo]))
    colWidths.sort()
    d = []
    for e in range(len(a)):
        d.append(len(a[e]))
    d.sort()
    for c in range(d[-1]):
        for b in range(len(a)):
            print(a[b][c].rjust(colWidths[-1] + 4), end = "")
        print("")
printTable(tableData)
→
      apples       Alice        dogs
     oranges         Bob        cats
    cherries       Carol       moose
      banana       David       goose

Chapter 7: practice project reference code

• Strong Password Detection:
import re
aRegex = re.compile(r"(.){8,}")
bRegex = re.compile(r"[A-Z]+")
cRegex = re.compile(r"[a-z]+")
dRegex = re.compile(r"\d+")
password = input("Please Input Your New Password:")
if aRegex.search(password) == None:
    print("Not a Strong Password for Less Than 8 Characters")
elif bRegex.search(password) == None:
    print("Not a Strong Password for Lack of Uppercase Letter ")
elif cRegex.search(password) == None:
    print("Not a Strong Password for Lack of Lowercase Letter")
elif dRegex.search(password) == None:
    print("Not a Strong Password for Lack of Numeric Character")
else:
    print(password + " Is a Strong Password")

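• Regex version of strip():
import re
deletedString = input("Deleted String:")
replacedString = input("Replaced String:")
if deletedString != "":
    # re.escape() keeps characters such as . or * in the input from being read as regex syntax
    aRegex = re.compile(re.escape(deletedString))
    # Removes every occurrence of deletedString, wherever it appears in replacedString
    aMatch = aRegex.sub("", replacedString)
    print(aMatch)
else:
    # With nothing to delete, strip the whitespace from both ends
    bRegex = re.compile(r"^\s*")
    bMatch = bRegex.sub("", replacedString)
    cRegex = re.compile(r"\s*$")
    cMatch = cRegex.sub("", bMatch)
    print(cMatch)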
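The program above removes the given text wherever it occurs. A variant that stays closer to str.strip(), trimming a set of characters from the two ends only, might look like the sketch below. The function name regexStrip and its parameters are hypothetical, and this is one possible reading of the exercise rather than the book's solution.

```python
import re

def regexStrip(text, chars=None):
    """Trim chars (or whitespace when chars is empty or None) from both ends of text."""
    if not chars:
        pattern = r"\A\s+|\s+\Z"
    else:
        # Escape the characters so that ], ^, - and \ are safe inside the class
        charClass = "[" + re.escape(chars) + "]"
        pattern = r"\A" + charClass + "+|" + charClass + r"+\Z"
    return re.sub(pattern, "", text)

print(regexStrip("   Hello World  "))
print(regexStrip("xxHelloxWorldxx", "x"))
print(regexStrip("--a-b--", "-"))
```

Using \A and \Z instead of ^ and $ anchors the pattern to the true start and end of the string, so the characters in the middle are never touched.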

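Chapter 8: practice project reference code

• Extended multi-clipboard:
#! python3
# With the .pyw extension, Python runs the program without showing a terminal window
# Keep pythonScript.bat and pythonScript.pyw in the same folder; the .bat file contains: @pyw.exe "C:\Path\to\your\pythonScript.pyw" %*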
import shelve, sys, pyperclip
locker = shelve.open("Locker")
if len(sys.argv) == 3:
    # save
    if sys.argv[1].lower() == "save":
        locker[sys.argv[2]] = pyperclip.paste()
    # delete
    elif sys.argv[1].lower() == "delete":
        del locker[sys.argv[2]]
    else:
        pass
elif len(sys.argv) == 2:
    # list
    if sys.argv[1].lower() == "list":
        pyperclip.copy(str(list(locker.keys())))
    # delete
    elif sys.argv[1].lower() == "delete":
        for key in list(locker.keys()):
            del locker[key]
    # load
    elif sys.argv[1] in locker:
        pyperclip.copy(locker[sys.argv[1]])
    else:
        pass
else:
    pass
locker.close()

• Mad Libs:
# Keep Mad Libs.txt in the same folder as the .py file (which must also be the current working directory); the .txt file contains: The ADJECTIVE panda ADVERB walked to the NOUN and then VERB. A nearby NOUN was ADVERB unaffected by these events.
import os, re
madLibs = open(os.path.join(os.path.abspath("."), "Mad Libs.txt"), "r")
originalText = madLibs.read()
madLibs.close()
originalRegex = re.compile(r"ADJECTIVE|NOUN|ADVERB|VERB")
originalMatch = originalRegex.findall(originalText)
answer = []
for i in originalMatch:
    if i == "ADJECTIVE":
        answer.append(input("Enter an adjective:"))
    elif i == "NOUN":
        answer.append(input("Enter a noun:"))
    elif i == "ADVERB":
        answer.append(input("Enter an adverb:"))
    elif i == "VERB":
        answer.append(input("Enter a verb:"))
tempMatch = originalRegex.sub(r"%s", originalText)
answerText = tempMatch%tuple(answer)
print(answerText)
newMadLibs = open("New Mad Libs.txt", "w")
newMadLibs.write(answerText)
newMadLibs.close()

• Regex Search:
import os, re
desigPath = r"C:\Users\Timmy\Desktop"
for filename in os.listdir(desigPath):
    if filename.endswith(".txt"):
        txtFile = open(desigPath + "\\" + filename)
        txtContent = txtFile.read()
        desigRegex = re.compile(r"\d{5,}")
        desigMatch = desigRegex.findall(txtContent)
        print(filename + " : " + str(desigMatch))
        txtFile.close()

Chapter 9: practice project reference code

• Selective Copy:
import os, shutil
def selCopy(selPath, newDirPath, fileType):
    num = 1
    while True:
        formalNewDirPath = newDirPath + " " + str(num)
        if not os.path.exists(formalNewDirPath):
            break
        num += 1
    os.makedirs(formalNewDirPath)
    for pathName, folderName, fileName in os.walk(selPath):
        for aFile in fileName:
            if aFile.endswith(fileType):
                shutil.copy(os.path.join(pathName, aFile), formalNewDirPath)
selCopy("D:\\Original Folder", "D:\\New Folder", ".pdf")

• Deleting Unneeded Files:
import os
def delHugeFile(selPath, fileSize):
    for pathName, folderName, fileName in os.walk(selPath):
        for aFile in fileName:
            if os.path.getsize(os.path.join(pathName, aFile)) > fileSize:
                print(os.path.join(pathName, aFile))
                # os.unlink(os.path.join(pathName, aFile))
delHugeFile("D:\\dirTest", 838860800)

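• Filling in the Gaps (repairing gaps in numbered file names):
# spam001_ant.txt
# spam002_bird.docx
# spam003_cat.txt
# spam005_dog.txt is renamed to spam004_dog.txt
# spam006_egg.docx is renamed to spam005_egg.docx
# spam007.txt is renamed to spam006.txt
# spam007_g.txt
# spam008_h.txt is renamed to spam009_h.txt (number 008 is kept free)
# spam009_i.txt is renamed to spam010_i.txt
import os, re, shutil
def reviseFileName(selFolder):
    # os.listdir() does not guarantee any order, so sort the names first
    fileList = sorted(os.listdir(selFolder))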
    num = 1
    oldRegex = re.compile(r"""(
    (spam)     #spam
    (\d{3})    #digits
    (.*)$      #others
    )""", re.VERBOSE)
    for oldName in fileList:
        numStr = str(num)
        ruleName = "spam" + numStr.rjust(3, "0")
        if not oldName.startswith(ruleName):
            oldMatch = oldRegex.search(oldName)
            newName = ruleName + oldMatch.group(4)
            print("Old File Name: " + oldName)
            print("New File Name: " + newName)
            # shutil.move(os.path.join(selFolder, oldName), os.path.join(selFolder, newName))
        num += 1
        if num == 8:
            num += 1
reviseFileName("D:\\dirTest")

Chapter 10: practice project reference code

• Debugging Coin Toss:
import random
def inputGuess():
    guess = ""
    while guess not in ("heads", "tails"):
        guess = input("Guess the coin toss! Enter heads or tails:")
    return guess
def autoToss():
    toss = random.randint(0, 1)
    if toss == 0:
        toss = "tails"
    else:
        toss = "heads"
    return toss
guess = inputGuess()
toss = autoToss()
if toss == guess:
    print("You got it!")
else:
    print("Nope! Guess again!")
    guess = inputGuess()
    toss = autoToss()
    if toss == guess:
        print("You got it!")
    else:
        print("Nope. You are really bad at this game.")

Timmy's Column

Sunday, April 1, 2018
