Phil 8.25.2022


  • Was at the Tech Summit for the last two days. Good to see people again!
    • Pinged Jennifer about Elicit
  • Trip
    • Tix! – done
    • Hotel! – done
    • Car! – done
    • Slides! – done, but not printed
  • Continue on Quarterly report
    • 9:00 Meeting with Ron


  • Respond to Katy’s letter – figuring out who would be good to send this to for review
    • Ping Brenda. I think we’ll need to meet next week – done
    • Wound up writing a short Python program to scan through my book and find which references are cited most. Mostly based on this. Handy!
from tkinter import filedialog
import re
from typing import Dict, List

import PyPDF2

# ask the user which pdf to scan
filename = filedialog.askopenfilename(filetypes=(("pdf files", "*.pdf"),), title="Load pdf File")
if filename:
    print("opening {}".format(filename))
    # open the pdf file (PdfFileReader and friends are the legacy, pre-3.0 PyPDF2 names)
    reader = PyPDF2.PdfFileReader(filename)
    # maps a citation marker such as "[12]" to the number of times it appears
    d: Dict[str, int] = {}

    # get number of pages
    num_pages = reader.getNumPages()
    print("There are {} pages".format(num_pages))
    # extract text and do the search
    for i in range(num_pages):
        page = reader.getPage(i)
        text = page.extractText()
        # find every bracketed numeric citation on this page, e.g. [12]
        reml: List[str] = re.findall(r"\[\d+\]", text)
        for r in reml:
            if r in d:
                d[r] += 1
            else:
                d[r] = 1
    # sort by count, most cited first
    ds = dict(sorted(d.items(), key=lambda x: x[1], reverse=True))
    for k, v in ds.items():
        print("{} = {}".format(k, v))
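
Note to self: the counting step could probably be done without the hand-rolled dict. A rough sketch using collections.Counter, kept separate from the PDF reading so it can be tried on plain strings. The names count_citations and sample_pages are made up for illustration, and it assumes citations look like [12]; grouped forms such as [3, 7] would be missed.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Matches bracketed numeric citation markers such as [12]
CITATION_RE = re.compile(r"\[\d+\]")


def count_citations(pages: Iterable[str]) -> List[Tuple[str, int]]:
    """Count citation markers across page texts, most cited first."""
    counts: Counter = Counter()
    for text in pages:
        # findall returns every marker on the page; update adds one per hit
        counts.update(CITATION_RE.findall(text))
    return counts.most_common()


# Hypothetical page texts standing in for text extracted from the pdf
sample_pages = ["See [1] and [2].", "As in [2], and again [2] and [3]."]
for marker, n in count_citations(sample_pages):
    print("{} = {}".format(marker, n))
```

In the script above, the list of page strings would come from calling extractText() on each page.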


  • Is today the day to try topic2vec? Sure hope so!
  • Started poking around. It’s hanging on the import, which is really odd.