Phil 4.7.20


  • Talk to Zach about chart size bug?
  • We are going to need a top-level dashboard, something like the number of countries in the DANGER, WARNING, and CONTROLLED buckets
  • Should look into using scipy’s linregress method to get accuracy values – done!

COVID Twitter

    • Read xls files into db (using this)
    • Wow, you can recursively get files in three lines, including the import:
      import glob
      for filename in glob.iglob("./" + '**/*.xls', recursive=True):
          print(filename)
    • Had to do a bunch of things to get Arabic to store correctly. I think I need to set the database to UTF-8:
      alter database covid_misinfo character set utf8 collate utf8_general_ci;

      and then set the table to UTF-8, like so:

      DROP TABLE IF EXISTS `table_tweets`;
      /*!40101 SET @saved_cs_client     = @@character_set_client */;
      /*!40101 SET character_set_client = utf8 */;
      CREATE TABLE `table_tweets` (
        `GUID` bigint(20) NOT NULL,
        `date` datetime NOT NULL,
        `URL` varchar(255) DEFAULT NULL,
        `contents` mediumtext NOT NULL,
        `translation` varchar(255) DEFAULT NULL,
        `author` varchar(255) DEFAULT NULL,
        `name` varchar(255) DEFAULT NULL,
        `country` varchar(255) DEFAULT NULL,
        `city` varchar(255) DEFAULT NULL,
        `category` varchar(255) DEFAULT NULL,
        `emotion` varchar(255) DEFAULT NULL,
        `source` varchar(255) DEFAULT NULL,
        `gender` varchar(16) DEFAULT NULL,
        `posts` int(11) DEFAULT NULL,
        `followers` int(11) DEFAULT NULL,
        `following` int(11) DEFAULT NULL,
        `influence_score` float DEFAULT NULL,
        `post_title` varchar(255) DEFAULT NULL,
        `post_type` varchar(255) DEFAULT NULL,
        `image_url` varchar(255) DEFAULT NULL,
        `brand` varchar(255) DEFAULT NULL,
        PRIMARY KEY (`GUID`)
      ) DEFAULT CHARSET=utf8;

      Anyway, it’s now working! (RT @naif_khalaf رحلة تطوير لقاح وقائي لمرض كورونا. استغرقت ٤ سنوات من المعمل لحيوانات التجارب للدراسات الحقلية على الإبل ثم للدراسة السريرية الأولية على البشر المتطوعين. ولازالت مستمرة.)
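A caveat on the character set, as a rough sketch rather than anything from the import code: in MySQL 5.x and 8.0, `utf8` is an alias for `utf8mb3`, which stores at most three bytes per character. Arabic letters take two bytes in UTF-8, so they fit, but emoji take four and would need `utf8mb4`. The helper names below (`max_utf8_bytes`, `fits_mysql_utf8`) are made up for illustration.

```python
def max_utf8_bytes(text: str) -> int:
    """Return the largest number of UTF-8 bytes used by any single character."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_mysql_utf8(text: str) -> bool:
    """True if every character fits in MySQL's 3-byte utf8 (utf8mb3)."""
    return max_utf8_bytes(text) <= 3


arabic = "رحلة تطوير لقاح"          # Arabic letters: 2 bytes each in UTF-8
with_emoji = arabic + " 😷"          # U+1F637 needs 4 bytes in UTF-8

print(max_utf8_bytes(arabic), fits_mysql_utf8(arabic))
print(max_utf8_bytes(with_emoji), fits_mysql_utf8(with_emoji))
```

If tweets with emoji turn up in the xls files, switching the database and table to `utf8mb4` is probably the safer choice.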


  • Write, visualize, and query test data
        • Writing seems to be working? I don’t get any errors, but I can’t see anything show up
        • Here’s an example of the data in what I think is correct line format:
          measure_1, tagKey_1=tagValue_11 val_1=0.0 1586270395
          measure_1, tagKey_1=tagValue_11 val_1=0.09983341664682815 1586270405
          measure_1, tagKey_1=tagValue_11 val_1=0.19866933079506122 1586270415
          measure_1, tagKey_1=tagValue_11 val_1=0.2955202066613396 1586270425
          measure_1, tagKey_1=tagValue_11 val_1=0.3894183423086505 1586270435
          measure_1, tagKey_1=tagValue_11 val_1=0.479425538604203 1586270445
          measure_1, tagKey_1=tagValue_11 val_1=0.5646424733950355 1586270455
          measure_1, tagKey_1=tagValue_11 val_1=0.6442176872376911 1586270465
          measure_1, tagKey_1=tagValue_11 val_1=0.7173560908995228 1586270475
          measure_1, tagKey_1=tagValue_11 val_1=0.7833269096274834 1586270485
          measure_1, tagKey_1=tagValue_11 val_1=0.8414709848078965 1586270495
          measure_1, tagKey_1=tagValue_11 val_1=0.8912073600614354 1586270505

          Here’s how I’m writing it:

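          from influxdb_client import InfluxDBClient, Point
          from influxdb_client.client.write_api import SYNCHRONOUS

          def to_influx(self, client:InfluxDBClient, bucket_name:str, org_name:str):
              write_api = client.write_api(write_options=SYNCHRONOUS)
              # each entry t is a (value, timestamp) pair
              for t in self.measurement_list:
                  for key, val in self.tags_dict.items():
                      # self.name (the measurement name) is an assumed attribute; the original argument was lost
                      p = Point(self.name).tag(key, val).field(self.keyfield, t[0])
                      write_api.write(bucket=bucket_name, record=p)
                      print("writing {}, {}={}, {}={} {}".format(self.name, key, val, self.keyfield, t[0], t[1]))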

          That seems to work. Here’s the output while it’s storing:

          writing measure_10, tagKey_1=tagValue_101, val_10=-0.34248061846961253 1586277701
          writing measure_10, tagKey_1=tagValue_101, val_10=-0.2469736617366209 1586277691
          writing measure_10, tagKey_1=tagValue_101, val_10=-0.1489990258141953 1586277681
          writing measure_10, tagKey_1=tagValue_101, val_10=-0.04953564087836742 1586277671
          writing measure_10, tagKey_1=tagValue_101, val_10=0.05042268780681122 1586277661

          I get no warnings or errors, but the Data Explorer is blank: [screenshot: influxdb]

        • Oh, you have to use Unix Timestamps in milliseconds (timestamp * 1000):
          mm.add_value(val, ts*1000)
        • Ok, it’s working, but my times are wrong: [screenshot: wrong_times]
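One possible explanation for the wrong times, offered as a guess: InfluxDB line protocol timestamps default to nanosecond precision, so a value in seconds or milliseconds that gets written as nanoseconds lands just after the 1970 epoch. Also, in line protocol the tag set follows the measurement name after a comma with no space. The sketch below uses only the standard library; `to_line_protocol` and `as_utc` are hypothetical helpers, not part of the InfluxDB client, and the sketch does not escape spaces or commas in names.

```python
from datetime import datetime, timezone

NS_PER_SECOND = 1_000_000_000


def to_line_protocol(measurement, tags, field_key, field_value, epoch_seconds):
    """Build one line-protocol record with a nanosecond timestamp."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    head = f"{measurement},{tag_str}" if tag_str else measurement
    ts_ns = int(epoch_seconds) * NS_PER_SECOND  # seconds -> nanoseconds
    return f"{head} {field_key}={float(field_value)} {ts_ns}"


def as_utc(epoch_value, units_per_second):
    """Interpret an epoch value given how many units make up one second."""
    return datetime.fromtimestamp(epoch_value / units_per_second, tz=timezone.utc)


line = to_line_protocol("measure_1", {"tagKey_1": "tagValue_11"},
                        "val_1", 0.0, 1586270395)
print(line)  # measure_1,tagKey_1=tagValue_11 val_1=0.0 1586270395000000000

# The same instant read correctly (seconds) and misread (ms taken as ns)
print(as_utc(1586270395, 1))
print(as_utc(1586270395 * 1000, NS_PER_SECOND))  # falls in 1970
```

If this is what is happening, either scaling the timestamp to nanoseconds or telling the client that the values are in seconds should put the points in April 2020.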


  • 1:00 IRAD meeting

ML Seminar