“It’s basically a handful of open-source projects duct-taped together. I started poking around and found some vulnerabilities relatively quickly. At the start it was mostly just curiosity but I decided to contact you once I saw what was in the database.”
Jim Donnies! Done
SBIRs
11:00 MP+SimAccel proposal meeting – went well, I think. Very different approaches. We’re new, bespoke, and they are legacy
1:30 LM MP tagup
Work on book – working!
GPT Agents
Ping everyone to say I’ve finished my pass through the paper
Officials have sought to tamp down the misinformation that has continued to spread online. The Federal Emergency Management Agency has been updating a webpage seeking to dispute common rumors, while the North Carolina Department of Public Safety has done the same, writing that authorities were “working around-the-clock to save lives and provide humanitarian relief.”
The ads also use an interesting blend of AI-generated cover images and real images within the slideshow itself. And not all of the ads are about North Korea. Some of them use AI-generated images of Taylor Swift and Jennifer Aniston to shill the supplements, while other slideshows are spreading disinformation about Mpox, are about TikTok trends like “demure,” or claim the supplements are “better than Ozempic.”
3:00 Demo kickoff meeting – mostly figuring out what resources (compute, screens, etc) will be needed
GPT Agents
Work on paper. Wrote the script to convert footnotes to citations. It works well! Had a few issues getting raw strings to behave:
from tkinter import filedialog
import re
from typing import Dict, List

def load_file_to_list(filename: str) -> List[str]:
    print("opening {}".format(filename))
    try:
        with open(filename, 'r') as file:
            return [line.strip() for line in file.readlines()]
    except FileNotFoundError:
        print("Error: File '{}' not found".format(filename))
        return []

def save_list_to_file(l: List[str], filename: str):
    print("opening {}".format(filename))
    try:
        with open(filename, 'w') as file:
            for s in l:
                file.write("{}\n".format(s))
    except OSError as e:
        print("Error writing '{}': {}".format(filename, e))

filename = filedialog.askopenfilename(filetypes=(("tex files", "*.tex"),), title="Load tex File")
if filename:
    filename2 = filename.replace(".tex", "_mod.tex")
    filename3 = filename.replace(".tex", "_mod.bib")

    # load the tex file
    l: List[str] = load_file_to_list(filename)

    # p1 matches \footnote{\url{...}}; p2 pulls the host out of the URL
    p1 = re.compile(r"\\footnote{\\url{(.*?)}}")
    p2 = re.compile(r"https?://([^/]*)")  # accept http as well as https

    l2: List[str] = []
    cite_dict: Dict[str, str] = {}
    count = 1
    for s1 in l:  # each line in the file
        m1 = p1.findall(s1)  # find all the footnote urls
        for s2 in m1:
            m2 = p2.match(s2)  # pull out the host for the cite key
            if m2 is None:
                continue  # skip URLs the pattern doesn't recognize
            # removeprefix (Python 3.9+), not strip: strip('www.') removes any
            # leading/trailing 'w' or '.' characters and can mangle the domain
            s3 = m2.group(1).removeprefix("www.")
            s3 = "{}_{}".format(s3, count)
            count += 1  # without this, repeat hosts collide on the same key
            olds = r"\footnote{\url{" + s2 + "}}"
            news = r"\cite{" + s3 + "}"
            s1 = s1.replace(olds, news)
            cite_dict[s3] = s2
        l2.append(s1)
        print(s1)
    save_list_to_file(l2, filename2)  # write the modified text to a new file

    l2 = []
    for key, val in cite_dict.items():
        s = "@misc{" + key + ",\n"
        s += '\tauthor = "{Last, First}",\n'
        s += '\tyear = "2024",\n'
        s += '\thowpublished = "\\url{' + val + '}",\n'
        s += '\tnote = "[Online; accessed 07-October-2024]"\n}\n'
        print(s)
        l2.append(s)
    save_list_to_file(l2, filename3)  # write the citation entries to a .bib file
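The raw-string issue mentioned above comes down to backslashes being escaped twice, once by Python and once by the regex engine. A minimal illustration (the example URL is hypothetical):

```python
import re

# Matching a literal \footnote requires \\footnote at the regex level.
# In a plain string each of those backslashes must be doubled again:
pattern_plain = "\\\\footnote"   # Python-escaped twice
pattern_raw = r"\\footnote"      # raw string: only the regex-level escape
assert pattern_plain == pattern_raw

# One remaining gotcha: a raw string may not end in an odd number of
# backslashes, e.g. r"\cite{x}\" is a SyntaxError.
m = re.search(r"\\footnote{\\url{(.*?)}}",
              r"text\footnote{\url{https://example.com}}")
assert m is not None and m.group(1) == "https://example.com"
```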
A cyberattack tied to the Chinese government penetrated the networks of a swath of U.S. broadband providers, potentially accessing information from systems the federal government uses for court-authorized network wiretapping requests.
Here are my AI-weapons thoughts on this: 1) If you can plant a MitM LLM that works to make people want to legislate back doors for cybercrime, you could set up this kind of operation. 2) If these backdoors already exist, you can plant LLMs and cause further havoc, or adjust the behavior of your adversary in more subtle ways.
“[Community Note] says this is AI. In this case, I don’t care. We should look out for our own (Americans before the rest of the world) and I wouldn’t be at all surprised if there was a girl fitting the description that wasn’t lucky enough to make it to a photographer for such an image,” wrote yet another.
Facebook itself is paying creators in India, Vietnam, and the Philippines for bizarre AI spam that they are learning to make from YouTube influencers and guides sold on Telegram.
Large model inference is shifting from cloud to edge due to concerns about the privacy of user interaction data. However, edge devices often struggle with limited computing power, memory, and bandwidth, requiring collaboration across multiple devices to run and speed up LLM inference. Pipeline parallelism, the mainstream solution, is inefficient for single-user scenarios, while tensor parallelism struggles with frequent communications. In this paper, we argue that tensor parallelism can be more effective than pipeline on low-resource devices, and present a compute- and memory-efficient tensor parallel inference system, named TPI-LLM, to serve 70B-scale models. TPI-LLM keeps sensitive raw data local in the users’ devices and introduces a sliding window memory scheduler to dynamically manage layer weights during inference, with disk I/O latency overlapped with the computation and communication. This allows larger models to run smoothly on memory-limited devices. We analyze the communication bottleneck and find that link latency, not bandwidth, emerges as the main issue, so a star-based allreduce algorithm is implemented. Through extensive experiments on both emulated and real testbeds, TPI-LLM demonstrated over 80% less time-to-first-token and token latency compared to Accelerate, and over 90% compared to Transformers and Galaxy, while cutting the peak memory footprint of Llama 2-70B by 90%, requiring only 3.1 GB of memory for 70B-scale models.
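The star-based allreduce the abstract describes can be sketched in a few lines. This is my own toy version, not TPI-LLM's implementation: every worker sends its partial tensor to a hub, which sums and broadcasts back, so each worker pays two sequential hops instead of the many latency-bound steps of a ring when link latency dominates.

```python
from typing import List

def star_allreduce(tensors: List[List[float]], hub: int = 0) -> List[List[float]]:
    # "Gather" phase: the hub receives every worker's partial tensor and sums.
    total = [0.0] * len(tensors[hub])
    for t in tensors:
        for i, v in enumerate(t):
            total[i] += v
    # "Broadcast" phase: the hub sends the reduced tensor back to every worker.
    return [list(total) for _ in tensors]

# Three devices each hold a partial activation; afterwards all three agree.
parts = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
reduced = star_allreduce(parts)
assert all(r == [9.0, 12.0] for r in reduced)
```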
9:30 meeting with Matt. Show the GPM demo and talk about coordination with expensive information, stiff & dense networks vs. slack and sparse, the need for embodiment to find truly novel things, the curse of dimensionality, explore/exploit and the need for diversity. done!
Of all such apps I have tried, the most ambitious is Replika. Unlike most of its competitors, it has a chat interface with elements that bring it close to life-simulation games like The Sims. You’re invited to name and design your bot, with various options for hairstyle and skin color, along with sliders that adjust the size of breasts or muscles. You’re then booted into a sort of bot purgatory: a white-walled waiting room, sparsely furnished, where the avatar paces like a prisoner, waiting for you to strike up conversation. Users are encouraged to customize the room with furniture and acquire new outfits using in-app currency. This can be bought with real money or earned by completing ‘quests’ such as talking to your bot every day or sharing photos. It’s a feedback loop that encourages constant engagement and self-disclosure, rewarding users with the power of customization, so that the bot feels made for you and you alone.
SBIRs
Honestly, everything here is support other than BD Opportunity research. I do think I’ll dig into that after lunch, since there should be new BAAs out by now?
This looks very good, if a bit dated. Deepset/Haystack appear to have continued development. So check out the website first. Build a Search Engine with GPT-3
Semantic search engines — our specialty here at deepset — are often powered by extractive question answering models. These models return snippets from the knowledge base verbatim, rather than generating text from scratch the way ChatGPT does. However, many applications can benefit from the abilities of generative LLMs. That’s why Haystack, deepset’s open-source framework for applied natural language processing (NLP), allows you to leverage multiple GPT models in your pipeline. With this approach, you can build a GPT-powered semantic search engine that uses your own data as ground truth and bases its natural-language answers on the information it contains.
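The retrieve-then-generate pattern deepset describes can be sketched without the framework. Word-overlap scoring here stands in for the embedding similarity a real Haystack pipeline would use, and all names are my own, not Haystack's API:

```python
from typing import List, Tuple

def tokenize(text: str) -> set:
    # Crude normalization standing in for real embedding-based similarity.
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(query: str, docs: List[str], top_k: int = 2) -> List[str]:
    # Rank documents by token overlap with the query; keep the top hits.
    q = tokenize(query)
    scored: List[Tuple[int, str]] = [(len(q & tokenize(d)), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:top_k] if score > 0]

def build_prompt(query: str, context: List[str]) -> str:
    # The generator sees only retrieved passages, so its answers stay
    # grounded in the knowledge base rather than parametric memory.
    return "Answer using only this context:\n{}\n\nQuestion: {}".format(
        "\n".join(context), query)

docs = ["Haystack is an open-source NLP framework by deepset.",
        "Extractive QA returns verbatim snippets from documents.",
        "The weather in Berlin is mild."]
prompt = build_prompt("What is Haystack?", retrieve("What is Haystack?", docs))
assert "open-source NLP framework" in prompt
```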
SBIRs
Maybe set up the trade show demo project? Nope, but soon, probably
Grants
Submit review of proposal 10. Done. And 12!
GPT Agents
Work on challenges section. Did review 12 instead. I’ll work on this tomorrow morning