“[Community Note] says this is AI. In this case, I don’t care. We should look out for our own (Americans before the rest of the world and I wouldn’t be at all surprised if there was a girl fitting the description that wasn’t lucky enough to make it to a photographer for such an image,” wrote yet another.
Facebook itself is paying creators in India, Vietnam, and the Philippines for bizarre AI spam that they are learning to make from YouTube influencers and guides sold on Telegram.
Large model inference is shifting from cloud to edge due to concerns about the privacy of user interaction data. However, edge devices often struggle with limited computing power, memory, and bandwidth, requiring collaboration across multiple devices to run and speed up LLM inference. Pipeline parallelism, the mainstream solution, is inefficient for single-user scenarios, while tensor parallelism struggles with frequent communications. In this paper, we argue that tensor parallelism can be more effective than pipeline on low-resource devices, and present a compute- and memory-efficient tensor parallel inference system, named TPI-LLM, to serve 70B-scale models. TPI-LLM keeps sensitive raw data local in the users’ devices and introduces a sliding window memory scheduler to dynamically manage layer weights during inference, with disk I/O latency overlapped with the computation and communication. This allows larger models to run smoothly on memory-limited devices. We analyze the communication bottleneck and find that link latency, not bandwidth, emerges as the main issue, so a star-based allreduce algorithm is implemented. Through extensive experiments on both emulated and real testbeds, TPI-LLM demonstrated over 80% less time-to-first-token and token latency compared to Accelerate, and over 90% compared to Transformers and Galaxy, while cutting the peak memory footprint of Llama 2-70B by 90%, requiring only 3.1 GB of memory for 70B-scale models.
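The star-based allreduce claim is worth a concrete picture. The sketch below is not TPI-LLM's implementation, just a toy comparison: a ring allreduce pays the per-hop link latency 2(N-1) times in sequence, while a hub-and-spoke (star) scheme pays it roughly twice (workers send in parallel, hub reduces and broadcasts back), which is why a star topology wins when latency rather than bandwidth is the bottleneck.

```python
# Toy illustration only, not TPI-LLM's implementation: it counts the
# latency-bound communication steps of a ring allreduce versus a star
# (hub-and-spoke) allreduce, and shows the star version producing the
# same reduced result at every worker.
import numpy as np

def ring_allreduce_steps(num_workers: int) -> int:
    # A ring allreduce needs (N-1) reduce-scatter steps plus (N-1)
    # allgather steps, and each sequential step pays the per-hop link latency.
    return 2 * (num_workers - 1)

def star_allreduce(worker_tensors):
    # All workers send their partial tensors to a hub in parallel (one
    # latency hop), the hub reduces, then broadcasts the sum back (a
    # second latency hop) -- two latency-bound steps regardless of N.
    reduced = np.sum(worker_tensors, axis=0)
    return [reduced.copy() for _ in worker_tensors]

partials = [np.random.rand(4) for _ in range(8)]
results = star_allreduce(partials)
assert all(np.allclose(r, np.sum(partials, axis=0)) for r in results)
print("ring latency-bound steps:", ring_allreduce_steps(8))  # 14
print("star latency-bound steps:", 2)
```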
9:30 meeting with Matt. Show the GPM demo and talk about coordination with expensive information, stiff & dense networks vs. slack and sparse, the need for embodiment to find truly novel things, the curse of dimensionality, explore/exploit and the need for diversity. done!
Of all such apps I have tried, the most ambitious is Replika. Unlike most of its competitors, it has a chat interface with elements that bring it close to life-simulation games like The Sims. You’re invited to name and design your bot, with various options for hairstyle and skin color, along with sliders that adjust the size of breasts or muscles. You’re then booted into a sort of bot purgatory: a white-walled waiting room, sparsely furnished, where the avatar paces like a prisoner, waiting for you to strike up conversation. Users are encouraged to customize the room with furniture and acquire new outfits using in-app currency. This can be bought with real money or earned by completing ‘quests’ such as talking to your bot every day or sharing photos. It’s a feedback loop that encourages constant engagement and self-disclosure, rewarding users with the power of customization, so that the bot feels made for you and you alone.
SBIRs
Honestly, everything here is support other than BD Opportunity research. I do think I’ll dig into that after lunch since there should be new BAAs now?
This looks very good, if a bit dated. Deepset/Haystack appear to have continued development. So check out the website first. Build a Search Engine with GPT-3
Semantic search engines — our specialty here at deepset — are often powered by extractive question answering models. These models return snippets from the knowledge base verbatim, rather than generating text from scratch the way ChatGPT does. However, many applications can benefit from the abilities of generative LLMs. That’s why Haystack, deepset’s open-source framework for applied natural language processing (NLP), allows you to leverage multiple GPT models in your pipeline. With this approach, you can build a GPT-powered semantic search engine that uses your own data as ground truth and bases its natural-language answers on the information it contains.
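For reference, here is a minimal sketch of that kind of retrieval-plus-generation pipeline written against the Haystack 1.x API; as noted above, the library has kept evolving, so class and template names may differ in current releases. The documents, the model name ("gpt-3.5-turbo"), and top_k are placeholders, and it assumes an OpenAI key in the OPENAI_API_KEY environment variable.

```python
# Rough sketch of a GPT-backed semantic search pipeline, assuming the
# Haystack 1.x API (BM25Retriever + PromptNode); not a drop-in recipe.
import os
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import BM25Retriever, PromptNode
from haystack.pipelines import Pipeline

# Index a few "your own data" documents to serve as ground truth.
document_store = InMemoryDocumentStore(use_bm25=True)
document_store.write_documents([
    {"content": "Haystack is deepset's open-source framework for applied NLP."},
    {"content": "Extractive QA models return verbatim snippets from the knowledge base."},
])

retriever = BM25Retriever(document_store=document_store, top_k=3)

# The PromptNode wraps the generative model; retrieved documents are fed
# into its prompt so the natural-language answer is grounded in the index.
prompt_node = PromptNode(
    model_name_or_path="gpt-3.5-turbo",
    api_key=os.environ["OPENAI_API_KEY"],
    default_prompt_template="question-answering",
)

pipeline = Pipeline()
pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
pipeline.add_node(component=prompt_node, name="PromptNode", inputs=["Retriever"])

result = pipeline.run(query="What does Haystack do?")
print(result["results"])
```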
SBIRs
Maybe set up the trade show demo project? Nope, but soon, probably
Grants
Submit review of proposal 10. Done. And 12!
GPT Agents
Work on challenges section. Did review 12 instead. I’ll work on this tomorrow morning
The prevailing methods to make large language models more powerful and amenable have been based on continuous scaling up (that is, increasing their size, data volume and computational resources [1]) and bespoke shaping up (including post-filtering [2,3], fine tuning or use of human feedback [4,5]). However, larger and more instructable large language models may have become less reliable. By studying the relationship between difficulty concordance, task avoidance and prompting stability of several language model families, here we show that easy instances for human participants are also easy for the models, but scaled-up, shaped-up models do not secure areas of low difficulty in which either the model does not err or human supervision can spot the errors. We also find that early models often avoid user questions but scaled-up, shaped-up models tend to give an apparently sensible yet wrong answer much more often, including errors on difficult questions that human supervisors frequently overlook. Moreover, we observe that stability to different natural phrasings of the same question is improved by scaling-up and shaping-up interventions, but pockets of variability persist across difficulty levels. These findings highlight the need for a fundamental shift in the design and development of general-purpose artificial intelligence, particularly in high-stakes areas for which a predictable distribution of errors is paramount.
SBIRs
10:30 LM followup. Moved the WP to LaTeX. Not sure about next steps
2:30 MDA meeting
Grants
Finish proposal 10 – done! I think I’ll submit this one and see how it fits in the EasyChair format before doing the next one.
Good ride yesterday. Managed to eke out a 20mph average, but it was hard for the last 30 miles or so. Here’s the power curve:
You can see that there is a nice 200-ish watt output for 2 hours, then another hour+ at a bit below that, and then the last hour at a much lower output. That is exactly the way that I felt. And it’s just like most of my rides this year with those two drops, which is really interesting. Just a bit more extreme, particularly the second drop. Part of that was warding off cramping, which is the problem I’ve been dealing with for the last couple of years, but most of it was that I really didn’t have that much left in my legs and was pretty much limping home.
Grants
Put together a review template and started to fill it in. I think that’s enough for a Sunday
Representation learning, and interpreting learned representations, are key areas of focus in machine learning and neuroscience. Both fields generally use representations as a means to understand or improve a system’s computations. In this work, however, we explore surprising dissociations between representation and computation that may pose challenges for such efforts. We create datasets in which we attempt to match the computational role that different features play, while manipulating other properties of the features or the data. We train various deep learning architectures to compute these multiple abstract features about their inputs. We find that their learned feature representations are systematically biased towards representing some features more strongly than others, depending upon extraneous properties such as feature complexity, the order in which features are learned, and the distribution of features over the inputs. For example, features that are simpler to compute or learned first tend to be represented more strongly and densely than features that are more complex or learned later, even if all features are learned equally well. We also explore how these biases are affected by architectures, optimizers, and training regimes (e.g., in transformers, features decoded earlier in the output sequence also tend to be represented more strongly). Our results help to characterize the inductive biases of gradient-based representation learning. We then illustrate the downstream effects of these biases on various commonly-used methods for analyzing or intervening on representations. These results highlight a key challenge for interpretability—or for comparing the representations of models and brains—disentangling extraneous biases from the computationally important aspects of a system’s internal representations.
More AI slop:
Amazing to watch Google destroy its core functionality chasing AI. Friends on the groupchat were talking about Rickey Henderson, who threw left and hit from the right side, which is really rare. If you go to Google to find other throw-left/bat-right players, this is what its AI gives you.
From https://bsky.app/profile/chrislhayes.bsky.social/post/3l55tbzk5ue2e. He continues: “This is is garbage! It’s worst than useless, it’s misleading! If you looked at it quickly you’d think Babe Ruth and Shohei also both threw left and batted right. Sure this is trivial stuff but the whole point is finding accurate information.“
Really good example of potential sources of AI pollution as it applies to research. Vetting sources may become progressively harder as the AI is able to interpolate across more data. More detail is in the screenshot:
The alt text describes “Screenshot of a Bridgeman Images page, which shows what appears to be a 19th-century photograph of a man in a top hat, but which has appended metadata for a completely different work of art, namely, a late medieval manuscript, which features an image of the author Pierre Bersuire presenting his book to the King of France.“
The underground exploitation of large language models (LLMs) for malicious services (i.e., Malla) is witnessing an uptick, amplifying the cyber threat landscape and posing questions about the trustworthiness of LLM technologies. However, there has been little effort to understand this new cybercrime, in terms of its magnitude, impact, and techniques. In this paper, we conduct the first systematic study on 212 real-world Mallas, uncovering their proliferation in underground marketplaces and exposing their operational modalities. Our study discloses the Malla ecosystem, revealing its significant growth and impact on today’s public LLM services. Through examining 212 Mallas, we uncovered eight backend LLMs used by Mallas, along with 182 prompts that circumvent the protective measures of public LLM APIs. We further demystify the tactics employed by Mallas, including the abuse of uncensored LLMs and the exploitation of public LLM APIs through jailbreak prompts. Our findings enable a better understanding of the real-world exploitation of LLMs by cybercriminals, offering insights into strategies to counteract this cybercrime.
In the ever-evolving realm of cybersecurity, the rise of generative AI models like ChatGPT, FraudGPT, and WormGPT has introduced both innovative solutions and unprecedented challenges. This research delves into the multifaceted applications of generative AI in social engineering attacks, offering insights into the evolving threat landscape using the blog mining technique. Generative AI models have revolutionized the field of cyberattacks, empowering malicious actors to craft convincing and personalized phishing lures, manipulate public opinion through deepfakes, and exploit human cognitive biases. These models, ChatGPT, FraudGPT, and WormGPT, have augmented existing threats and ushered in new dimensions of risk. From phishing campaigns that mimic trusted organizations to deepfake technology impersonating authoritative figures, we explore how generative AI amplifies the arsenal of cybercriminals. Furthermore, we shed light on the vulnerabilities that AI-driven social engineering exploits, including psychological manipulation, targeted phishing, and the crisis of authenticity. To counter these threats, we outline a range of strategies, including traditional security measures, AI-powered security solutions, and collaborative approaches in cybersecurity. We emphasize the importance of staying vigilant, fostering awareness, and strengthening regulations in the battle against AI-enhanced social engineering attacks. In an environment characterized by the rapid evolution of AI models and a lack of training data, defending against generative AI threats requires constant adaptation and the collective efforts of individuals, organizations, and governments. This research seeks to provide a comprehensive understanding of the dynamic interplay between generative AI and social engineering attacks, equipping stakeholders with the knowledge to navigate this intricate cybersecurity landscape.
Realized that I may be able to generate a lot of trajectories really quickly by having a base trajectory and an appropriate envelope to contain whatever function (sine wave, random walk, etc.) I want to overlay. Train that first, and then have a second model that uses the first to calculate the likely interception point. Kind of like what Google does (did?) with predictive search modeling.
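A quick numpy sketch of the generator idea (all names, shapes, and ranges are made up for illustration): a straight-line base trajectory gets a sine or random-walk overlay whose amplitude is bounded by an envelope, so each sample varies a lot while staying plausible and pinned to the endpoints.

```python
# Sketch of the base-trajectory-plus-bounded-overlay idea; 2D points at
# fixed timesteps, purely illustrative parameter choices.
import numpy as np

def make_trajectory(n_steps=100, overlay="sine", max_amplitude=1.0, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    t = np.linspace(0.0, 1.0, n_steps)

    # Base trajectory: straight line between two random points.
    start = rng.uniform(-10, 10, size=2)
    end = rng.uniform(-10, 10, size=2)
    base = np.outer(1 - t, start) + np.outer(t, end)

    # Envelope keeps the overlay pinned to the base at both ends.
    envelope = max_amplitude * np.sin(np.pi * t)

    if overlay == "sine":
        wiggle = np.sin(2 * np.pi * rng.uniform(1, 4) * t + rng.uniform(0, 2 * np.pi))
    else:  # normalized random walk
        wiggle = np.cumsum(rng.normal(0, 0.1, n_steps))
        wiggle /= max(np.abs(wiggle).max(), 1e-9)

    # Apply the bounded wiggle perpendicular to the base direction.
    direction = (end - start) / np.linalg.norm(end - start)
    normal = np.array([-direction[1], direction[0]])
    return base + np.outer(envelope * wiggle, normal)

trajectories = [make_trajectory(overlay=o) for o in ("sine", "walk") for _ in range(50)]
```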
I just read this, and all I can think of is that this is exactly the slow speed attack that AI would be so good at. It’s a really simple playbook. Run various versions against all upcoming politicians. Set up bank accounts in their names that pay for all this and work on connecting their real accounts. All it takes is vast, inhuman patience: https://www.politico.com/news/2024/09/23/mark-robinson-porn-sites-00180545
SBIRs
1:00 Tradeshow demo meeting – done. Looks like I will be doing the development of the back end
I’d like to write an essay that compares John Wick to Field of Dreams as an example of what AI can reasonably be expected to be able to produce and what it will probably always struggle with.
This weekend is the last push to move everything into the garage for the basement finishing
Dumpster – good progress! I think I’ll finish tomorrow
Move shelving and storage (don't forget that the trailer can also be used for longer-term storage)
Bike stuff
Grants
Finished reading proposal 10. Since I have notes, I’m going to read the other two first and get a sense of what these grants look like before writing the reviews
Dr Newman received the award for research that revealed fundamental flaws in extreme old-age demographic research, demonstrating that data patterns are likely to be dominated by errors and that supercentenarian and remarkable-age records exhibit patterns indicative of clerical errors and pension fraud (paper pre-print, not yet peer-reviewed).
Chores
Recycling run – done!
Goodwill run – done!
11:30 Lunch with Greg – done!
Trim grass – done
SBIRs
2:00 Meeting – done
Grants
Continue reading proposal 10. Everything is due Oct 9. About 3/4 through
“AI” often means artificial intentionality: trying to trick others into thinking that deliberate effort was invested in some specific something. The attention that was never invested is instead extracted from the consumer, a burden placed on them.
SBIRs
9:00 standup
10:30–11:30 Virtual Event | The Cyber Landscape in the Indo-Pacific | Center for a New American Security. It was interesting – the big players (e.g., Microsoft) are still treating hostile information operations as a form of cybercrime, which tends to be slow and may not be up to warfare-level engagements. It's kind of like treating war as the crime of mass murder. Which it sort of is, but working to bring your enemy to trial only happens after the shooting stops, usually.
Meeting with Aaron to discuss white paper. Probably Monday
4:30 Book Club
GPT Agents
Finish background! Done! Reorganized things too.
2:45 LLM meeting – did a lot of editing. We should have a first draft by next week
Grants
Continue reading proposal 10. Everything is due Oct 9. About halfway through
Craigslist founder Craig Newmark believes hacking by foreign governments is a major risk to the U.S. and plans to donate $100 million to bolster the country’s cybersecurity.