Phil 12.2.2022

Core dimensions of human material perception

  • Visually categorizing and comparing materials is crucial for our everyday behaviour. Given the dramatic variability in their visual appearance and functional significance, what organizational principles underly the internal representation of materials? To address this question, here we use a large-scale data-driven approach to uncover the core latent dimensions in our mental representation of materials. In a first step, we assembled a new image dataset (STUFF dataset) consisting of 600 photographs of 200 systematically sampled material classes. Next, we used these images to crowdsource 1.87 million triplet similarity judgments. Based on the responses, we then modelled the assumed cognitive process underlying these choices by quantifying each image as a sparse, non-negative vector in a multidimensional embedding space. The resulting embedding predicted material similarity judgments in an independent test set close to the human noise ceiling and accurately reconstructed the similarity matrix of all 600 images in the STUFF dataset. We found that representations of individual material images were captured by a combination of 36 material dimensions that were highly reproducible and interpretable, comprising perceptual (e.g., “grainy”, “blue”) as well as conceptual (e.g., “mineral”, “viscous”) dimensions. These results have broad implications for understanding material perception, its natural dimensions, and our ability to organize materials into classes.

Need to read LM steganography paper


  • Did my self-assessment yesterday. So meaningless.
  • Finish the Q3 report today. I think there should be no problems


  • Financial paperwork uploads!

GPT Agents

  • Get some anti-Semitic terms from the GPT and add them to the pull list to evaluate for min/max/average (this part goes in the writeup!)
  • Start the pull
  • OpenAI’s GPT-3 has upgraded their Davinci model to 003. It is putting out some really good stuff! For example, I asked the chatbot to “write some python code to visualize a Sierpenski Gasket”. A reasonably arcane request. Here’s what it wrote:
import matplotlib.pyplot as plt

def sierpinski(points, degree):
    # Base case: draw the triangle
    if degree == 0:
        triangle = plt.Polygon(points, fill='black')
        # Recursive case: divide the triangle into three smaller triangles and
        # recursively draw each of them
        sierpinski([points[0], midpoint(points[0], points[1]), midpoint(points[0], points[2])], degree - 1)
        sierpinski([points[1], midpoint(points[0], points[1]), midpoint(points[1], points[2])], degree - 1)
        sierpinski([points[2], midpoint(points[2], points[1]), midpoint(points[0], points[2])], degree - 1)

def midpoint(p1, p2):
    return [(p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2]

# Set up the plot

# Draw the Sierpinski gasket
points = [[-1, -1/2**0.5], [0, 1/2**0.5], [1, -1/2**0.5]]
sierpinski(points, 5)

# Center the gasket in the plot
plt.xlim(-1.5, 1.5)
plt.ylim(-1.5, 1.5)

# Show the plot

  • And this is what the code generates:
  • Pretty impressive
  • YugabyteDB is a high-performance distributed massive SQL Database. It is the only relational open source database that can scale at a global level providing ACID semantics for transactions. It provides low latency, massive scale, high available data store. It has a cloud offering and enterprise edition.
  • Started talking to the Mastodon API. It is extremely straightforward:
import json
import requests

# A playground for exploring the Mastodon REST interface (

def create_timeline_url(instance:str = "", limit:int=10):
    url = "https://{}/api/v1/timelines/public?limit={}".format(instance, limit)
    print("create_timeline_url(): {}".format(url))
    return url

def connect_to_endpoint(url) -> json:
    response = requests.request("GET", url)
    print("Status code = : {}".format(response.status_code))
    if response.status_code != 200:
        raise Exception(
            "Request returned an error: {} {}".format(
                response.status_code, response.text
    return response.json()

def print_response(title:str, j:json):
    json_str = json.dumps(j, indent=4, sort_keys=True)
    print("\n------------ Begin '{}':\nresponse:\n{}\n------------ End '{}'\n".format(title, json_str, title))

def main():
    instance_list = ["", ""]
    for instance in instance_list:
        url = create_timeline_url(instance, 1)
        rsp = connect_to_endpoint(url)
        print_response("{} test:".format(instance), rsp)

if __name__ == "__main__":

  • It’s based on collections in the ActivityPub protocol, which is a decentralized social networking protocol based upon the [ActivityStreams] 2.0 data format. It provides a client to server API for creating, updating and deleting content, as well as a federated server to server API for delivering notifications and content.