Phil 11.13.15

7:00 – 3:30 SR

  • Brought in the flex code and gave to Al.
  • Al thought that I had not brought in all the Java code and was going to have me produce a spreadsheet stating where all the jar files could be found. In reality, I had brought in all the jar files, but the import in his IDE hadn’t done a proper job and he never looked inside the zip file to see what was there. Sigh.
  • Al also doesn’t have the System diagrams that I sent him through Bill on November third. Verifying that they got there. If they didn’t, I’ll burn a new disk and bring that in.
  • Told the folks on site that I was taking off the next 4 weeks to work on my thesis, with a high probability that I would be put on a new position before I got back, since apparently there’s no full-time work for me on SR any more (Steve F’s words).
  • Wrote up a description for Lenny to do the truancy reports
  • Burned disks that have the entire Viztool project on them. Gave one to Joe and Bill gets the other one to pass on to Al.

Phil 11.12.15

7:00 – 3:00 SR

  • More training
  • Check ingest manager?
  • Pull SVN info and re-burn disk
  • Query Builder uses a view called __project_detailed_data. This view is a join of many tables to view the financial data in a complete way. Remember the huge query to generate the view? All the queries used in Query Builder are stored in qb_queries table and managed by Query Builder.

    If you do SELECT * FROM __project_details_data, you get 598 columns… Each row in the view represents the complete data for a REQ in the FA, all the way through Appropriation Year 4.

Phil 11.9.15

7:00 – 3:00 SR

  • Training
  • Got all the Java files built and burned to disk. The main problem that I had was getting a Tomcat runtime instance to show up. Here was the fix: http://stackoverflow.com/questions/2000078/apache-tomcat-not-showing-in-eclipse-server-runtime-environments

Phil 11.6.15

Burning vacation today

Working on the dictionary server code

  • Make sure that adding a parent doesn’t cause a loop.
  • Adding UpdateStatementObject helper class since I seem to be doing that a lot. And of course it took a few hours longer than I thought it would, but it’s nice and general.
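
A rough sketch of the loop check for adding a parent, written in Python for brevity and not the actual server code. It assumes each dictionary entry stores at most one parent id; the function name and the sample entries are made up for illustration.

```python
def would_create_loop(parents, child, new_parent):
    """Return True if making new_parent the parent of child would close a cycle.

    parents maps an entry id to its parent id (None for a root entry).
    """
    seen = set()
    node = new_parent
    # Walk up from the proposed parent; reaching the child means a loop.
    while node is not None:
        if node == child:
            return True
        if node in seen:
            # The stored data already loops somewhere; treat that as unsafe.
            return True
        seen.add(node)
        node = parents.get(node)
    return False


# Hypothetical entries: spring -> java -> skill
parents = {"java": "skill", "skill": None, "spring": "java"}
print(would_create_loop(parents, "skill", "spring"))  # True
print(would_create_loop(parents, "spring", "skill"))  # False
```

The walk is linear in the depth of the hierarchy, so it is cheap enough to run on every parent update.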

Phil 11.5.15

7:00 – 4:00 SR

  • Meeting with financial analysts.
  • Work on downloading and compiling projects from the current repository.
  • Need to update the old dev box that was set up to do Flash Development. Can’t get a new copy of FB 4.6.
    • Now on update 46 of 61… Ok Done
    • Tried installing Dong’s server code on Luna – it didn’t recognize the folders as Java projects. He was using Mars (4.5.0). Trying that, and following Dong’s notes.

Phil 11.4.15

7:00 – 3:30 SR

  • And we have more confusion on what’s happening. Still going through the process of bringing everything inside.
  • Helped Al a bit on how money moves around in the system
  • Set up Al in the integration scripting system.
  • Long discussions about requirements

Phil 11.3.15

7:00 – 5:30 SR

  • I’ve decided I don’t like heading home in the dark, so I’m going to stay on daylight saving time. Or at least give it a shot. 4:45 am looks very early on my clock….
  • Getting the SW documentation over to Al.
    • For some reason the System diagrams weren’t in the SVN repo. Fixed that and sent a zip file over to Bill.
  • Status report!
  • Meeting with Al and Lenny about future work. I literally have no idea if we should just set everything up for maintenance or build a new NLP-based search engine for financial questions. Hopefully Lenny can get some answers.
  • Meeting at Infotek. Al is now the lead. I am to package everything up for deployment. Future work will be on some other vehicle.
  • Trying to get FB 4.7 running, but it’s hanging on the launch screen while thrashing the CPU. Fortunately it was just a test. Pulling down everything from the new repository to build on FB 4.6. Verifying that FB 4.6 should work
  • Setting up a subversion server

Phil 11.2.15

8:00 – 5:00 SR

  • Al came on board today. Showed him around the system, discovering that the scripting system wasn’t working on the production server. Fixed that, and downloaded a copy of the documentation for him to look at. Also gave him accounts on the integration server for him to poke around.
  • Fixed the Reqonciler bug. Had to insert the modified query directly into the reqonciler table to get around odd quote-escaping issues.
  • Updated Friday’s work from the repo. Updated the database and ran the term extraction and dictionary tests.
  • Working on dictionary access methods.
    • Got AddEntry and Remove Entry working. Also removed the tn_dictionary table and stuck the dictionary_id in the tn_dictionary_entries table.
    • Added cascade entry/modification of parent if it doesn’t exist. Otherwise the indices won’t work.
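
For the quote-escaping problem with the Reqonciler query, here is a sketch of how a parameterized insert avoids escaping the query text by hand. Assumptions: sqlite3 stands in for the real database, and the table, columns, order number and query text are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the table that stores post-processing queries.
conn.execute("CREATE TABLE stored_queries (run_order INTEGER, query_text TEXT)")

# A query string containing both single and double quotes.
query_text = """UPDATE demo SET label = 'month 1 "year 2"' WHERE uid = 1"""

# The ? placeholders let the driver handle quoting; nothing is escaped manually.
conn.execute(
    "INSERT INTO stored_queries (run_order, query_text) VALUES (?, ?)",
    (10, query_text),
)

stored = conn.execute(
    "SELECT query_text FROM stored_queries WHERE run_order = ?", (10,)
).fetchone()[0]
print(stored == query_text)  # True
```

The stored text comes back byte-for-byte identical, which is the property the direct insert was after.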

Phil 10.30.15

8:00 – 4:00 SR

  • Working from home today, waiting for people to show up.
  • Here’s the fix for the Reqonciler issue:
    • Open Reqonciler in your browser.
    • click Post-Processing button to see all queries
    • double click the one that you disabled this morning to edit, Order 2100, update month 1 year 2 to 100% from month 12 year 1
    • add " AND NOT ISNULL(bc.uid)" at the end of the query, without the double quotes. Make sure there is a space before the AND.
    • Save, run, and check the data
  • In the process of getting my home dev environment working again. I swear I should just do this once a week so it’s less stressful.
    • Fixed the Imagick load so that there is a test for the extension and whether the extension is installed correctly.
    • Disabled the world wide web service so that apache could run on port 80
    • Updated all the files in the Apache htdocs directory. Forgot that I had updated the server access methods to take an object.
    • It occurs to me that I can load up the DB directly on the server if I don’t get everything done with the dictionary by Wednesday.
  • Examine AlchemyNLP and see if there is a hierarchy that can be used. Not without a lot of work.
  • Buy and download the fivefilters term extractor and see how to integrate.
    • Ordered. Waiting for confirmation to show up.
    • Installed. Time to see if it’ll work. It looks good, though possibly slow? Starting to put together a dictionary class to examine it more deeply.
  • Add dictionary Flyout directive
    • Name the dictionary
    • Choose the networks (add/remove from list)
    • Input html, text or url
    • Get the clean text and show the machine extracted terms. We could look up potential definitions too – from wordnik. Set up an account and applied for a developer key.
    • Show a list of selected terms with checkboxes
      • Checked items can be deleted or grouped
      • Items can be added by typing into a field
    • Show a list of ‘group items’. This displays a list of the items whose index appears in the ‘parent’ field
      • Selecting an item in this list reorders the item list to show the appropriate group first
  • There should also be a select dictionary option on the network flyout
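
The ‘group items’ idea from the flyout list can be sketched as below. This is only an illustration in Python, not the planned directive code; the ‘parent’ field is from the notes, while the ‘id’ and ‘word’ keys, the function names and the sample entries are assumptions.

```python
from collections import defaultdict


def group_by_parent(entries):
    """Map each parent id to the ids of the entries that name it as parent."""
    groups = defaultdict(list)
    for entry in entries:
        if entry["parent"] is not None:
            groups[entry["parent"]].append(entry["id"])
    return dict(groups)


def reorder_for_group(entries, groups, parent_id):
    """Put the members of the selected group first, keeping relative order."""
    members = set(groups.get(parent_id, []))
    # sorted() is stable, so order within members and non-members is preserved.
    return sorted(entries, key=lambda e: e["id"] not in members)


entries = [
    {"id": 1, "word": "skill", "parent": None},
    {"id": 2, "word": "task", "parent": None},
    {"id": 3, "word": "filing", "parent": 2},
    {"id": 4, "word": "python", "parent": 1},
    {"id": 5, "word": "sql", "parent": 1},
]
groups = group_by_parent(entries)
print([e["word"] for e in reorder_for_group(entries, groups, 1)])
```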

Phil 10.29.15

8:00 – 4:30 SR

  • Sent Dong screenshots of the issue. He’s checking queries and code now.
  • Added simpleTests($dbObj) to each class in AlchemyNLP
  • Added ‘skill’, ‘capability’ and ‘task’ as parents in the dictionary
  • Add flyout directive to create and assign dictionaries and entries.
  • Set the dictionary to zero in the networkDbIo.addNetwork()  PHP code and add the dict_id to the typescript interface. Done
  • Make sure that an association between a keyword and another item is always from the keyword. Otherwise PageRank won’t calculate correctly. Done.
  • Chain up the dictionary and add parent keywords to the network (parents point to children). That way, for example, all ‘skills’ can be elevated, while all ‘tasks’ can be suppressed. Done
  • Changed keywords to be ‘editable’ so they have adjustable link weights. It does make the keywords in the network editable as well. May need to just add a slider to ITEMS of certain types. Still need to think about this…
  • Next step is to buy and download the fivefilters term extractor and see how to integrate?
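
To show why link direction and weight matter, here is a generic weighted PageRank by power iteration. It is a textbook-style sketch, not the calculation the network code actually runs; the node names, weights and the damping value of 0.85 are assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration.

    edges is a list of (source, target, weight) tuples; rank flows from
    source to target in proportion to the edge weight.
    """
    nodes = sorted({n for s, t, _ in edges for n in (s, t)})
    out_weight = {n: 0.0 for n in nodes}
    for s, _, w in edges:
        out_weight[s] += w
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing weight is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for s, t, w in edges:
            if w > 0.0:
                new_rank[t] += damping * rank[s] * w / out_weight[s]
        rank = new_rank
    return rank


# Parent keyword -> child keywords -> items, always pointing away from the keyword.
edges = [
    ("skill", "python", 1.0),
    ("skill", "sql", 1.0),
    ("python", "item_a", 1.0),
    ("sql", "item_a", 1.0),
    ("sql", "item_b", 1.0),
]
ranks = pagerank(edges)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Because every association points from the keyword, an item linked by two keywords collects rank from both. Raising the weight on one outgoing link shifts more of that keyword's rank to the target, which is the effect an adjustable link weight would have in this model.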

Phil 10.28.15

8:00 – 5:00 SR

  • Walked through the FA bug with Dong on the phone. Took some screenshots that I will send over tonight.
  • Add a DictionaryText class that uses a passed-in tag list to determine what items to create associations to. Low edit-distance matches get added to the item. Possibly the keyword list can be hierarchical?
  • Add a tn_dictionary table with fields for word, type (optional), description (optional), server_code (optional), parent (optional), and user_id. Multiple users can have different versions of the same word. When a new word is entered, the content of the network is rescanned and items that contain the keyword link to it. We will need to know which definition is being used in the network, since it will point to the master item. – Done, except for the last part
    • The server_code field would include scripts/regexes or something similar that could do special text scanning. This would require the use of eval, for example. In the db, but not used.
  • So now, when an external query is made, only items from the result that contain words in the dictionary will be added to the network. Done and working in the DB and PHP!
  • There should also be a ‘resubmit’ button that looks for new material while running the stored queries. TODO
  • It’s possible to use NLP, particularly FiveFilters’, to create a strawman dictionary as a starting point. TODO
  • Meeting with Dr. Pan
    • There are different contexts that a keyword dictionary needs to be aware of. Resumes have skills, tasks and achievements. Scientific papers have contributions and methods, financial data has budget centers, companies, clients, invoices, etc.
    • Phrases add specificity, single words can be very noisy.
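
The dictionary filter on external query results can be sketched as follows. This is a Python illustration rather than the working DB/PHP code; the item structure, function names and sample text are assumptions. It matches whole words and whole phrases only, which is one way to keep single words from being too noisy.

```python
import re


def matching_terms(text, dictionary_terms):
    """Return the dictionary terms (words or phrases) that occur in text."""
    found = []
    for term in dictionary_terms:
        # \b keeps 'task' from matching inside 'multitasking'.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


def filter_results(items, dictionary_terms):
    """Keep only the items whose text contains at least one dictionary term."""
    kept = []
    for item in items:
        terms = matching_terms(item["text"], dictionary_terms)
        if terms:
            kept.append({**item, "terms": terms})
    return kept


# Hypothetical query results and dictionary.
items = [
    {"id": 1, "text": "Invoice assigned to the Budget Center in 2015"},
    {"id": 2, "text": "Multitasking is a myth"},
]
kept = filter_results(items, ["budget center", "task"])
print([item["id"] for item in kept])  # [1]
```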

Phil 10.27.15

8:00 – 5:00 SR

  • Still chasing down the Reqonciler issues.
  • Need to put together a list of software that I have installed on my dev box and send to Lenny.
  • Still working on the EXPLICIT entry. Added two manual items directly in the PHP and it’s working.
    • Also see if I can get HTML parsed and displayed – done
    • Based on that paper, I’m going to try keywords again. OK, I have keywords, but they’re really key phrases and as such, too unique? Here’s a screenshot: dump
    • There are things that I think should correlate, like web-based, status, senior and visualization. There’s also ‘analysis’ and ‘analysys’ that should be matched by edit distance. So there are some options:
      • Try a different NLP. Open Calais is free enough for what I’m doing, and does provide different parsing. FiveFilters has a PHP implementation that is 20 Euros, and actually seems to work best for what I’m looking for (items linked by keywords)?
      • Or a naive kind of tagging that does some naive keyword extraction. These could be presented as potential tags to be checked by the users?

Phil 10.26.15

8:00 – 10:00, 12:00 – 3:00 SR

  • Query Builder is not capturing all of the COGNOS data correctly. Specifically for FY15 the second year data is not being added into the queries and for FY14 the 3rd year of data is not being added into the queries. FY16 which is in year 1 is working correctly.
  • I’ve checked the tables in the database and it looks like everything is going in correctly (e.g. the cognos_obligations table has timestamps from today). So I don’t know if there is a problem with ingest or with the FA queries.
  • Working on getting all the pieces working in the EXPLICIT item case. I have an issue where the ‘query’ field is used for a lot of things – I’m definitely going to have to clean that up. Probably the best thing to do is to align the database and the GUI better.
    • When getting the text for the item to show as HTML, I got an Angular ‘sanitizer unable to parse’ error that I couldn’t figure out. After trying sceProvider.trustAsResourceUrl in the directive without success, I went back (slowly) through the HTML. It turned out to be the difference between this:
      $htmlStr .= '<img src="assets/'.$this->rssImage.'" height="80"></a></td>';

      and this:

      $htmlStr .= '<img src="assets/'.$this->rssImage.'"height="80"></a></td>';

      Didja see it? It’s the space before “height=80”. Chrome handled it when I was working it out, but angular chokes.
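
      A quick check for this kind of bug could look like the sketch below. It is a Python regex heuristic, not a real HTML parser, and the function name and sample file name are made up. It flags a quoted attribute value whose closing quote runs straight into the next attribute name.

```python
import re

# A closing quote followed directly by a name character means the
# whitespace between two attributes is missing.
FUSED_ATTRIBUTE = re.compile(r'=\s*"[^"]*"(?=[A-Za-z_:])')


def find_fused_attributes(html):
    """Return the attribute values that run straight into the next attribute."""
    return [match.group(0) for match in FUSED_ATTRIBUTE.finditer(html)]


good = '<img src="assets/rss.png" height="80">'
bad = '<img src="assets/rss.png"height="80">'
print(find_fused_attributes(good))  # []
print(find_fused_attributes(bad))   # ['="assets/rss.png"']
```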

Phil 10.23.15

8:00 – 4:00 SR

  • Things look good for the switchover
  • Getting the PHP right for items from text
    • Fixed the author parsing
    • Fixed image save
    • Fixed guid generation for manual add. Alchemy likes my resume. I wonder what makes that more parse-able?
    • Will need to generate some default values for the items used to generate the guid.
  • Add some new tn_types for tn_items so we know what the source is? Either that or add a ‘source’ field….
    • ANLP_
    • USER_
    • GNEWS_
    • etc
  • 3:00 switching production servers.
    • Looks like everything works.