Category Archives: proposal

Phil 10.30.18

7:00 – 3:30 ASRC PhD

  • Search as embodied in the “Ten Blue Links” meets the requirements of a Perrow “Normal Accident”
    • The search results are densely connected. That’s how PageRank works. Even latent connections matter.
    • A change in the popularity of a page rapidly affects its rank, so the connections are stiff.
    • The relationships of the returned links, both to each other and to the broader information landscape in general, are hidden.
    • An additional density and stiffness issue is that everyone uses Google, so there is a dense, stiff connection between the search engine and the population of users
  • Write up something about how
    • ML can make maps, which decrease the likelihood of IR contributing to normal accidents
    • AI can use these maps to understand the shape of human belief space, and where the positive regions and dangerous sinks are.
  • Two measures for maps are the concepts of range and length. Range is the distance that a trajectory can be placed on the map and remain contiguous. Length is the total distance that a trajectory travels, independent of the map it’s placed on.
  • Write up the basic algorithm of ML to map production
    • Take a set of trajectories that are known to be in the same belief region (why JuryRoom is needed) as the input
    • Generate an N-dimensional coordinate frame that best preserves length over the greatest range.
    • What is used as the basis for the trajectory may matter. The range (at a minimum), can go from letters to high-level topics. I think any map reconstruction based on letters would be a tangle, with clumps around TH, ER, ON, and AN. At the other end, an all-encompassing meta-topic, like WORDS would be a single, accurate, but useless single point. So the map reconstruction will become possible somewhere between these two extremes.
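The length measure described above can be sketched in a few lines. This is only an illustration: it assumes a trajectory is an ordered list of points in some coordinate frame and that distance is Euclidean, neither of which is settled in these notes. The name trajectory_length and the sample path are hypothetical. Range is not sketched, since it depends on how contiguity on the map ends up being defined.

```python
import math

def trajectory_length(points):
    """Total distance travelled: the sum of Euclidean steps between consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

# Hypothetical 2-D trajectory with two steps of length 5.0 and 4.0.
path = [(0.0, 0.0), (3.0, 4.0), (3.0, 8.0)]
print(trajectory_length(path))  # 5.0 + 4.0 = 9.0
```

Because the sum only uses distances between consecutive points, the same function works for any number of dimensions, which is what the N-dimensional coordinate frame would need.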
  • The Nietzsche text is pretty good. In particular, check out the way the sentences form based on the seed “s when one is being cursed”:
    • the fact that the spirit of the spirit of the body and still the stands of the world
    • the fact that the last is a prostion of the conceal the investion, there is our grust
    • the fact them strongests! it is incoke when it is liuderan of human particiay
    • the fact that she could as eudop bkems to overcore and dogmofuld
    • In this case, the first 2-3 words are the same, followed by random, semi-structured text. That’s promising, since the comparison would be on the seed plus the generated text.
  • Today, see how fast a “Shining” (All work and no play makes Jack a dull boy.) text can be learned and then try each keyword as a start. As we move through the sentence, the probability of the next words should change.
    • Generate the text set
    • Train the Nietzsche model on the new text. Done. Here are examples with one epoch and a batch size of 32, with a temperature of 1.0:
      ----- diversity: 0.2
      ----- Generating with seed: "es jack a 
      dull boy all work and no play"
      es jack a 
      dull boy all work and no play makes jack a dull boy all work and no play makes jack a dull boy all work and no play makes jack a dull boy all work and no play makes 
      
      ----- diversity: 0.5
      ----- Generating with seed: "es jack a 
      dull boy all work and no play"
      es jack a 
      dull boy all work and no play makes jack a dull boy all work and no play makes jack a dull boy all work and no play makes jack a dull boy all work and no play makes 
      
      ----- diversity: 1.0
      ----- Generating with seed: "es jack a 
      dull boy all work and no play"
      es jack a 
      dull boy all work and no play makes jack a dull boy anl wory and no play makes jand no play makes jack a dull boy all work and no play makes jack a 
      
      ----- diversity: 1.2
      ----- Generating with seed: "es jack a 
      dull boy all work and no play"
      es jack a 
      dull boy all work and no play makes jack a pull boy all work and no play makes jack andull boy all work and no play makes jack a dull work and no play makes jack andull

      Note that the errors start with a temperature of 1.0 or greater
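A minimal sketch of how temperature (labelled “diversity” in the log above) reshapes the next-character distribution, which may help explain why errors only show up at 1.0 and above. The function names and the example probabilities are illustrative assumptions, not the code used for these runs.

```python
import numpy as np

def apply_temperature(preds, temperature=1.0):
    """Rescale a probability vector: low temperature sharpens it, high temperature flattens it."""
    preds = np.asarray(preds, dtype=np.float64)
    # Work in log space; the small epsilon avoids log(0).
    logits = np.log(preds + 1e-12) / temperature
    # Softmax back to a normalized distribution (shifted by the max for stability).
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def sample(preds, temperature=1.0, rng=None):
    """Draw one index from the temperature-adjusted distribution."""
    if rng is None:
        rng = np.random.default_rng()
    probs = apply_temperature(preds, temperature)
    return int(rng.choice(len(probs), p=probs))

# Hypothetical model output over three candidate characters.
preds = [0.7, 0.2, 0.1]
for t in (0.2, 0.5, 1.0, 1.2):
    print(t, apply_temperature(preds, t).round(3))
```

At low temperature nearly all of the probability mass sits on the most likely character, so a well-learned repeating sentence is reproduced exactly. At 1.0 the model's own distribution is used unchanged, and above 1.0 the unlikely characters are boosted, so occasional wrong characters become possible.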

    • Rewrite the last part of the code to generate text based on each word in the sentence.
      • So I tried that and got gobbledygook. The issue is that the prediction only works on waveform-sized chunks. To verify this, I created a seed from the input text, truncating it to maxlen (20 in this case):
        sentence = "all work and no play makes jack a dull boy"[:maxlen]

        That worked, but it means that the character-based approach isn’t going to work for single-word seeds.

        ----- temperature: 0.2
        ----- Generating with seed: [all work and no play]
        all work and no play makes jack a dull boy all work and no play makes jack a dull boy all work and no play makes 
        
        ----- temperature: 0.5
        ----- Generating with seed: [all work and no play]
        all work and no play makes jack a dull boy all work and no play makes jack a dull boy all work and no play makes 
        
        ----- temperature: 1.0
        ----- Generating with seed: [all work and no play]
        all work and no play makes jack a dull boy all work and no play makes jack a dull boy pllwwork wnd no play makes 
        
        ----- temperature: 1.2
        ----- Generating with seed: [all work and no play]
        all work and no play makes jack a dull boy all work and no play makes jack a dull boy all work and no play makes
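The fixed-size input problem can be seen in how a character-level training set is typically built. The sketch below is an assumption about the general technique rather than the actual script: make_windows, the step of 3, and the repeated text are all hypothetical, with maxlen set to 20 as in the note above.

```python
import numpy as np

def make_windows(text, maxlen=20, step=3):
    """Cut text into overlapping maxlen-character inputs, each paired with the next character."""
    chars = sorted(set(text))
    index = {c: i for i, c in enumerate(chars)}
    starts = range(0, len(text) - maxlen, step)
    # One-hot tensors: (samples, maxlen, vocabulary) for inputs, (samples, vocabulary) for targets.
    x = np.zeros((len(starts), maxlen, len(chars)), dtype=bool)
    y = np.zeros((len(starts), len(chars)), dtype=bool)
    for n, s in enumerate(starts):
        for t, c in enumerate(text[s:s + maxlen]):
            x[n, t, index[c]] = True
        y[n, index[text[s + maxlen]]] = True
    return x, y, chars

text = "all work and no play makes jack a dull boy " * 10
x, y, chars = make_windows(text)
print(x.shape, y.shape)

# A single keyword is shorter than the window the model was trained on.
print(len("jack"), "characters versus a window of", x.shape[1])
```

Every training input is exactly maxlen characters wide, so a model trained this way has only ever seen full windows. A one-word seed has to be padded or truncated to fit, which is consistent with the 20-character seed working where the per-word seeds did not.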

         

    • Based on this result and the ensuing chat with Aaron, we’re going to revisit the whole LSTM with numbers and build out a process that will support words instead of characters.
  • Looking for CMAC models, I found Self Organizing Feature Maps at NeuPy.com.
  • Here’s How Much Bots Drive Conversation During News Events
    • Late last week, about 60 percent of the conversation was driven by likely bots. Over the weekend, even as the conversation about the caravan was overshadowed by more recent tragedies, bots were still driving nearly 40 percent of the caravan conversation on Twitter. That’s according to an assessment by Robhat Labs, a startup founded by two UC Berkeley students that builds tools to detect bots online. The team’s first product, a Chrome extension called BotCheck.me, allows users to see which accounts in their Twitter timelines are most likely bots. Now it’s launching a new tool aimed at news organizations called FactCheck.me, which allows journalists to see how much bot activity there is across an entire topic or hashtag

Phil 9.20.18

7:00 – 5:00 ASRC MKT

  • Submit pre-approval for school – done!
  • Call bank – done!
  • Tried to do stuff on the Lufthansa site but couldn’t log in
  • Read through the USPTO RFI and realized it was a good fit for the Research Browser. Sent the RB white paper to those in the decision loop.
  • Updated the JuryRoom white paper to include an appendix on self-governance and handling hate speech, etc.
  • Introducing Cloud Inference API: uncover insights from large scale, typed time-series data
    • Today, we’re announcing the Cloud Inference API to address this need. Cloud Inference API is a simple, highly efficient and scalable system that makes it easier for businesses and developers to quickly gather insights from typed time series datasets. It’s fully integrated with Google Cloud Storage and can handle datasets as large as tens of billions of event records. If you store any time series data in Cloud Storage, you can use the Cloud Inference API to begin generating predictions.
    • Thread by Jeff Dean
  • Realized that there are additional matrices that can post-multiply the Laplacian. That way we can break down the individual components that contribute to “stiffness”. The reason for this is that only identical oscillators will synchronize. Similarity is a type of implicit coordination
    • Leave the Master matrix [M]: as degree on the diagonal, with “1” for a connection, “0” for no connection
    • Bandwidth matrix [B]: has a value (0, 1) for each connection
    • Alignment matrix [A]: calculates the direction cosine between each connected node. Completely aligned nodes get an edge value of 1.0
    • There can also be a Weight vector W: which contains the “mass” of the node. A high mass node will be more influential in the network.
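A rough sketch of how the component matrices above might be built and combined. Several things here are assumptions: the three-node network and its values are made up, the components are combined element-wise per edge rather than by post-multiplication, and the result uses the standard graph Laplacian sign convention (degree minus weights) rather than the “1 for a connection” layout noted above. The node mass vector is left out.

```python
import numpy as np

# Hypothetical 3-node network with edges 0-1 and 1-2.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)

# Bandwidth matrix [B]: an illustrative value in (0, 1) for each connection.
bandwidth = np.array([[0.0, 0.9, 0.0],
                      [0.9, 0.0, 0.4],
                      [0.0, 0.4, 0.0]])

# One heading vector per node: nodes 0 and 1 agree, node 2 is orthogonal to both.
headings = np.array([[1.0, 0.0],
                     [2.0, 0.0],
                     [0.0, 1.0]])

def alignment_matrix(headings, adjacency):
    """Alignment matrix [A]: direction cosine between the headings of connected nodes."""
    unit = headings / np.linalg.norm(headings, axis=1, keepdims=True)
    return (unit @ unit.T) * adjacency

def weighted_laplacian(edge_weights):
    """Standard weighted graph Laplacian: weighted degree on the diagonal minus the weights."""
    return np.diag(edge_weights.sum(axis=1)) - edge_weights

alignment = alignment_matrix(headings, adjacency)
edge_weights = adjacency * bandwidth * alignment  # assumed element-wise combination
laplacian = weighted_laplacian(edge_weights)
print(alignment)
print(laplacian)
```

Keeping [B] and [A] as separate arrays means each contribution to stiffness can be inspected on its own before it is folded into the edge weights. In this example the 1-2 edge has bandwidth but zero alignment, so it contributes no coupling at all.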
  • Had a few thoughts about JuryRoom self governance. The major social networks seem to be a mess with respect to what rights users have, and what constitutes a violation of terms of service. The solutions seem pretty brittle (Radiolab podcast on facebook rule making). JuryRoom has built in a mechanism for deliberation. Can that be used to create an online legal framework for crowdsourcing the rules and the interpretation? Roughly, I think that this requires the following:
    • A constitution – a simple document that lays out how JuryRoom will be governed.
    • A bill of rights. What are users entitled to?
    • The concept of petition, trial, binding decisions, and precedent.
    • Is there a concept of testifying under oath?
    • The addition of “evidence” attachments that can be linked to posts. This could be existing documents, commissioned expert opinion, etc.
    • A special location for the “legal decisions”. These will become the basis for the precedent in future deliberations. Links to these prior decisions are done as attachments? Or as something else?
    • Localization. Since what is acceptable (within the bounds of the constitution and the bill of rights) changes as a function of culture, there needs to be a way that groups can split off from the main group to construct and use their own legal history. Voting/membership may need to be a part of this.
      • What is visible to non-members?
      • What are the requirements to be a member?
      • How are legal decisions implemented in software?
      • What are the duties of a “citizen”?
  • More iConf paper
  • I wanted to make figures align on the bottom. Turns out that the way that you do this is to set top alignment [t] for each minipage. Here’s my example:
    \begin{figure}[h]
    	\centering
    	\begin{minipage}[t]{.5\textwidth}
    		\centering
    		\fbox{\includegraphics[width=20em]{Nomad-Flocking-Stampede2.png}}
    		\caption{\label{fig:N-F-S} Evolved systems}
    	\end{minipage}%
    	\begin{minipage}[t]{.5\textwidth}
    		\centering
    		\fbox{\includegraphics[width=20em]{Nomad-Stampede.png}}
    		\caption{\label{fig:Monolithic_complex_nomad} Designed systems}
    	\end{minipage}%
    \end{figure}

     

Phil 9.17.18

7:00 – ASRC MKT

  • Dan Ariely, Professor of psychology and behavioral economics, Duke University (Scholar)
    • Controlling the Information Flow: Effects on Consumers’ Decision Making and Preferences
      • One of the main objectives facing marketers is to present consumers with information on which to base their decisions. In doing so, marketers have to select the type of information system they want to utilize in order to deliver the most appropriate information to their consumers. One of the most interesting and distinguishing dimensions of such information systems is the level of control the consumer has over the information system. The current work presents and tests a general model for understanding the advantages and disadvantages of information control on consumers’ decision quality, memory, knowledge, and confidence. The results show that controlling the information flow can help consumers better match their preferences, have better memory and knowledge about the domain they are examining, and be more confident in their judgments. However, it is also shown that controlling the information flow creates demands on processing resources and therefore under some circumstances can have detrimental effects on consumers’ ability to utilize information. The article concludes with a summary of the findings, discussion of their application for electronic commerce, and suggestions for future research avenues.
      • This may be a good example of work that relates to socio-cultural interfaces.
  • Democracy’s Wisdom: An Aristotelian Middle Way for Collective Judgment
    • Josiah Ober (Scholar)
    •  The Greeks had experts determine choices, and the public vote between the expert choices
    • A satisfactory model of decision-making in an epistemic democracy must respect democratic values, while advancing citizens’ interests, by taking account of relevant knowledge about the world. Analysis of passages in Aristotle and legislative process in classical Athens points to a “middle way” between independent-guess aggregation and deliberation: an epistemic approach to decision-making that offers a satisfactory model of collective judgment that is both time-sensitive and capable of setting agendas endogenously. By aggregating expertise across multiple domains, Relevant Expertise Aggregation (REA) enables a body of minimally competent voters to make superior choices among multiple options, on matters of common interest. REA differs from a standard Condorcet jury in combining deliberation with voting based on judgments about the reputations and arguments of domain-experts.
  • NESTA Center for Collective Intelligence Design
    • The Centre for Collective Intelligence Design will explore how human and machine intelligence can be combined to make the most of our collective knowledge and develop innovative and effective solutions to social challenges.
    • Call for ideas (JuryRoom!)
      • Nesta is offering grants of up to £20,000 for projects that generate new knowledge on how to advance collective intelligence (combining human and machine intelligence) to solve social problems.
  • Synchronize gdrive, subversion
  • Finish abstract review
  • Organize iConf paper into something more coherent
    • Created folder for lit review
  • Start putting together notes on At Home in the Universe?
  • Ping folks from SASO
    • Graph Laplacian paper
    • Cycling stuff
  • Fika?
  • Meeting with Wayne?

Phil 1.24.18

7:00 – 5:00 ASRC MKT

  • H1: Groups are defined by a common location, orientation, and velocity (LOV) through a navigable physical or cognitive space. The amount of group cohesion and identification is proportional to the amount of similarity along all three axes.
  • H2: Group Behavior emerges from mutual influence, based on awareness and trust. Mutual influence is facilitated by Dimension Reduction: The lower the number of dimensions, the easier it is to produce a group.
  • H3: Group behavior has three distinct patterns: Nomadic, Flocking and Stampeding. These behaviors are dictated by the level of trust and awareness between individuals having similar LOVs
    • H3a: The trustworthiness of the underlying information space can be inferred from the group behaviors through belief space. All agents seek out fitness peaks (reward gradients) and avoid valleys (risk gradients) within the space. (Risk = negative heading alignment, increase speed. Reward = positive heading alignment, decrease speed.)
      • Nomadic emphasizes environmental gradients as an individual or small group of agents. This supports the broadest awareness of the belief space, though it may be difficult to infer fitness peaks. Gradient discovery is less influenced by additional social effects.
      • Flocking behavior results from environmentally constrained social gradient seeking. For example, distance attenuates social influence. If an agent finds a risk or reward, that information cascades through the population as a function of the environmental constraints. (Note: In-group and out group could be manifestations of pure social gradient creation.)
      • Stampede emphasizes social gradients. This becomes easier as groups become larger and a strong ‘social reality’ occurs. When social influence is dominant at the expense of environmental awareness, a runaway stampede can occur. The beliefs and associated information that underlie a stampede can be inferred to be untrustworthy.
  • H4: Individual trajectories through these spaces, when combined with large numbers of other individual trajectories produce maps which reflect the dimensions that define the groups in that space.
  • These conclusions can be derived though
  • Continuing with BIC
    • GroupIdentification
  • Fundamentals of Data Visualization
    • I’m very excited to announce my latest project, a book on data visualization. The working title is “Fundamentals of Data Visualization”. The book will be published with O’Reilly, and a preview is available here. The entire book is written in R Markdown, and the figures are made with ggplot2. The source for the book is available on github.
  • Sex differences in the use of social information emerge under conditions of risk
    • Social learning provides an effective route to gaining up-to-date information, particularly when information is costly to obtain asocially. Theoretical work predicts that the willingness to switch between using asocial and social sources of information will vary between individuals according to their risk tolerance. We tested the prediction that, where there are sex differences in risk tolerance, altering the variance of the payoffs of using asocial and social information differentially influences the probability of social information use by sex. In a computer-based task that involved building a virtual spaceship, men and women (N = 88) were given the option of using either asocial or social sources of information to improve their performance. When the asocial option was risky (i.e., the participant’s score could markedly increase or decrease) and the social option was safe (i.e., their score could slightly increase or remain the same), women, but not men, were more likely to use the social option than the asocial option. In all other conditions, both women and men preferentially used the asocial option to a similar degree. 
  • Thinking Fast and Slow on Networks: Co-evolution of Cognition and Cooperation in Structured Populations
    •  In line with past work in well-mixed populations, we find that selection favors either the intuitive defector (ID) strategy which never deliberates, or the dual-process cooperator (DC) strategy which intuitively cooperates but uses deliberation to switch to defection in Prisoner’s Dilemma games. We find that sparser networks (i.e. smaller average degree) facilitate the success of DC over ID, while also reducing the level of deliberation that DC agents engage in; and that these results generalize across different kinds of networks.
  • Joanna J Bryson 7:30 AM – 24 Jan 2018: This didn’t happen because humans are evil. It happens because intelligence is computation—an expensive physical process—and therefore limited. Thread very worth reading.
  • A bit more Angular
  • Compared the speed of execution for LSTM on my and Aaron’s boxes. His newer card is a bit faster than my TITAN
  • Most of the day was spent putting together the ppt for the ML/AI workshop on Monday

Phil 8.16.17

7:00 – 8:00 Research

  • Added takeaway thoughts to my C&C writeup.
  • Working out how to add capability to the sim for P&RCH paper. My thoughts from vacation:
    • The agent’s contribution is the heading and speed
    • The UI is what the agents can ‘see’
    • The IR is what is available to be seen
    • An additional part might be to add the ability to store data in the space. Then the behavior of the IR (e.g. empty areas) would be more apparent, as would the effects of UI (only certain data is visible, or maybe only nearby data is visible). Data could be a vector field in Hilbert space, and visualized as color.
  • Updated IntelliJ
  • Working out how to have a voxel space for the agents to move through that can also be drawn. It’s any number of dimensions, but it has to project to 2D. In the case of the agents, I just choose the first two axes. Each agent has an array of statements that are assembled into a belief vector. The space can be an array of beliefs. Are these just constructed so that they fill a space according to a set of rules? Then the xDimensionName and yDimensionName axes would go from (0, 1), which would scale to stage size? IR would still be a matter of comparing the space to the agent’s vector. Hmm.
  • This looks really good from an information horizon perspective: The Role of the Information Environment in Partisan Voting
    • Voters are often highly dependent on partisanship to structure their preferences toward political candidates and policy proposals. What conditions enable partisan cues to “dominate” public opinion? Here I theorize that variation in voters’ reliance on partisanship results, in part, from the opportunities their environment provides to learn about politics. A conjoint experiment and an observational study of voting in congressional elections both support the expectation that more detailed information environments reduce the role of partisanship in candidate choice

9:00 – 5:00 BRI

  • Good lord, the BoA corporate card comes with SIX separate documents to read.
  • Onward to Chapter Three and Spring database interaction
  • Well that’s pretty clean. I do like the JdbcTemplate behaviors. Not sure I like the way you specify the values passed to the query, but I can’t think of anything better if you have more than one argument:
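    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.List;
    import javax.sql.DataSource;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.jdbc.core.JdbcTemplate;
    import org.springframework.jdbc.core.RowMapper;
    import org.springframework.stereotype.Repository;

    // Employee, EmployeeImpl and EmployeeDao are assumed to be in the same package.
    @Repository
    public class EmployeeDaoImpl implements EmployeeDao {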
        @Autowired
        private DataSource dataSource;
    
        @Autowired
        private JdbcTemplate jdbcTemplate;
    
        private RowMapper<Employee> employeeRowMapper = new RowMapper<Employee>() {
            @Override
            public Employee mapRow(ResultSet rs, int i) throws SQLException {
                Employee employee = new EmployeeImpl();
                employee.setEmployeeAge(rs.getInt("Age"));
                employee.setEmployeeId(rs.getInt("ID"));
                employee.setEmployeeName(rs.getString("FirstName") + " " + rs.getString("LastName"));
                return employee;
            }
        };
    
        @Override
        public Employee getEmployeeById(int id) {
            Employee employee = null;
    
            employee = jdbcTemplate.queryForObject(
                    "select * from Employee where id = ?",
                    new Object[]{id},
                    employeeRowMapper
            );
            return employee;
        }
    
        public List<Employee> getAllEmployees() {
            List<Employee> eList = jdbcTemplate.query(
                    "select * from Employee",
                    employeeRowMapper
            );
            return eList;
        }
    }
  • Here’s the xml to wire the thing up:
    <context:component-scan base-package="org.springframework.chapter3.dao"/>
    <bean id="employeeDao" class="org.springframework.chapter3.dao.EmployeeDaoImpl"/>
    
    <bean id="dataSource"
          class="org.springframework.jdbc.datasource.DriverManagerDataSource">
        <property name="driverClassName" value="${jdbc.driverClassName}" />
        <property name="url" value="${jdbc.url}" />
        <property name="username" value="xxx"/>
        <property name="password" value="yyy"/>
    </bean>
    
    <bean id="jdbcTemplate" class="org.springframework.jdbc.core.JdbcTemplate">
        <property name="dataSource" ref="dataSource" />
    </bean>
    
    <context:property-placeholder location="jdbc.properties" />
  • And here’s the properties. Note that I had to disable SSL:
    jdbc.driverClassName=com.mysql.jdbc.Driver
    jdbc.url=jdbc:mysql://localhost:3306/sandbox?autoReconnect=true&useSSL=false

Phil 7.25.16

7:00 – 4:00 VTX

  • Rollers
  • Reworking the lit review. Meeting set up with Wayne for tomorrow at 4:00.
  • Still thinking about modelling. I could use sets of strings that would define a CA’s worldview and then compare individuals by edit distance.
    • Not sure how to handle weights: a number, or repetitions of the character?
    • Comparing a set of CAs using centrality could show what the most important items are in that (overall and sub) population. How closely the individual CA conforms to that distribution is a measure of the ‘belonging’?
    • CAs could adjust their internal model. Big changes should be hard, little changes should be easy. Would the dropping of a low ranked individual item result in a big change in edit distance with a group that doesn’t have the item?
    • Working on infrastructure that builds, collects and maintains Factoids
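The edit-distance comparison of CA worldviews considered above could look something like the following. It is a sketch only: it uses plain, unweighted Levenshtein distance, and the single-letter items standing in for worldview elements are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # delete item_a
                            curr[j - 1] + 1,      # insert item_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

# Hypothetical worldviews, one letter per item, ordered by rank.
group = "ABCD"
individual = "ABCDE"
print(edit_distance(individual, group))       # before dropping the low-ranked item E
print(edit_distance(individual[:-1], group))  # after dropping it
```

With unweighted distance every item counts the same, so dropping any single item changes the distance by at most one regardless of its rank. Making big changes hard and little changes easy would need a weighted variant, which ties back to the open question about how to represent weights.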

Phil 7.22.16

7:00 – 1:00 VTX

  • More bubble modelling. Found a nice paper from a financial perspective that looks like a good source for similar models.
  • Split out the calculation and spreadsheet functions to support snapshots and debugging.
    • Set up the base class to be the control. Explorers only look outside their SD, while confirmers and avoiders stay within. Not sure how to tease out the difference between those. I think it will have something to do with the way they look for information, which is beyond the scope of this model for now. Also switched to a random distribution. Here’s an initial result. Much more work to follow.

GP

  • I was riding and thinking about something I read on fivethirtyeight.com: “This isn’t the most artful way to say it, but it’s like, where do you go when the only people who seem to agree with you on taxes hate black people?” It’s by Ben Howe, a RedState commentator. And it makes me think that rather than basing the sim on only one value, there should be a cluster. Confirmers could look for a match in the cluster while avoiders would avoid clusters if they hit something that doesn’t match. And the distance from the value should matter. Adopting a very different concept should take more energy than a similar one. And this makes me think that the CAs have to have a bit more alife in them. They need to budget their energy with reference to their internal and external states.
  • And then mom died. Here’s the OPM web page that matters: https://www.opm.gov/retirement-services/my-annuity-and-benefits/life-events/death/report-of-death/

Phil 6.15.16

7:00 – 10:00, 12:00 – 4:00 VTX

  • Got the official word that I should be charging the project for research. Saved the email this time.
  • Continuing to work on the papers list
  • And in the process of looking at Daniele Quercia‘s work, I found Auralist: introducing serendipity into music recommendation which was cited by
    An investigation on the serendipity problem in recommender systems. Which has the following introduction:

    • In the book ‘‘The Filter Bubble: What the Internet Is Hiding from You’’, Eli Pariser argues that Internet is limiting our horizons (Parisier, 2011). He worries that personalized filters, such as Google search or Facebook delivery of news from our friends, create individual universes of information for each of us, in which we are fed only with information we are familiar with and that confirms our beliefs. These filters are opaque, that is to say, we do not know what is being hidden from us, and may be dangerous because they threaten to deprive us from serendipitous encounters that spark creativity, innovation, and the democratic exchange of ideas. Similar observations have been previously made by Gori and Witten (2005) and extensively developed in their book ‘‘Web Dragons, Inside the Myths of Search Engine Technology’’ (Witten, Gori, & Numerico, 2006), where the metaphor of search engines as modern dragons or gatekeepers of a treasure is justified by the fact that ‘‘the immense treasure they guard is society’s repository of knowledge’’ and all of us accept dragons as mediators when having access to that treasure. But most of us do not know how those dragons work, and all of us (probably the search engines’ creators, either) are not able to explain the reason why a specific web page ranked first when we issued a query. This gives rise to the so called bubble of Web visibility, where people who want to promote visibility of a Web site fight against heuristics adopted by most popular search engines, whose details and biases are closely guarded trade secrets.
    • Added both papers to the corpus. Need to read and code. What I’m doing is different in that I want to add a level of interactivity to the serendipity display that looks for user patterns in how they react to the presented serendipity and incorporate that pattern into a trustworthiness evaluation of the web content. I’m also doing it in Journalism, which is a bit different in its constraints. And I’m trying to tie it back to Group Polarization and opinion drift.
  • Also, Raz Schwartz at Facebook: Editorial Algorithms: Using Social Media to Discover and Report Local News
  • Working on getting all html and pdf files in one matrix
  • Spent the day chasing down a bug where if the string being annotated is too long (I’ve set the number of words to 60), then we skip. This leads to a divide by zero issue. Fixed now.

Phil 6.13.16

6:30 – 2:30 VTX

Phil 6.9.16

6:00 – 12:00 Writing

  • Going to go through the RQs and describe how to address them
  • Start with the back end and my local cohort, which I can assume to be diversity-seeking because of where they are.
  • Iteratively develop tool so that it gets used for diversity-related activities
  • Logs and questionnaires.
  • Scraping for Google Scholar and CaseLaw? Java code is here.
  • Looks like Google Scholar has also started to add the concept of pertinence in?
  • Finished the Research Plan. Do need a timeline.
  • Finished discussion/conclusion. Done(ish)!

Phil 6.4.16

7:30 – 1:30 Writing

  • More on libraries and serendipity. Found lots, and then went on to look for mentions in electronic retrieval. Found Foster’s A Nonlinear Model of Information-Seeking Behavior, which also has some spiffy citations. Going to take a break from writing and actually read this one, because I just realized that interdisciplinary researchers are the rough academic equivalent of the explorer pattern.
  • Investigating Information Seeking Behavior Using the Concept of Information Horizons
    • Page 3 – To design and develop a new research method we used Sonnenwald’s (1999) framework for human information behavior as a theoretical foundation. This theoretical framework suggests that within a context and situation is an ‘information horizon’ in which we can act. For a particular individual, a variety of information resources may be encompassed within his/her information horizon. They may include social networks, documents, information retrieval tools, and experimentation and observation in the world. Information horizons, and the resources they encompass, are determined socially and individually. In other words, the opinions that one’s peers hold concerning the value of a particular resource will influence one’s own opinions about the value of that resource and, thus, its position within one’s information horizon. 

Phil 5.31.16

7:00 – 4:30 VTX

  • Writing. Working on describing how maintaining many codes in a network contains more (and more subtle) information than grouping similar codes.
  • Working on the UrlChecker
    • In the process, I discovered that the annotation.xml file is unique only for the account and not for the CSE. All CSEs for one account are contained in one annotation file
    • Created a new annotation called ALL_annotations.xml
    • fixed a few things in Andy’s file
    • Reading in everything. Now to produce the new sets of lists.
    • I think it’s just easier to delete all the lists and start over.
    • Done and verified. You run UrlChecker from the command line, with the input file being a list of domains (one per line) and the ALL_annotations.xml file.
  • https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2
  • Need to add a Delete or Hide button to reduce down a large corpus to a more effective size.
  • Added. Tomorrow I’ll wire up the deletion of a row or column and the recreation of the initialMatrix

Phil 5.27.16

7:00 – 2:00 VTX

  • Wound up writing the introduction and saving the old intro to a new document – Themesurfing
  • Renamed the current document
  • Got the parser working. Old artifact settings.
  • Added some tweaks to show progress better. I’m kinda stuck with the single thread in JavaFx having to execute before text can get shown.
  • Need an XML parser to find out what sites have already been added. Added an IntelliJ project to the GoogleCseConfigFiles SVN file. Should be able to finish it on Tuesday.