D20:
- Talk to Zach about chart size bug?
- We are going to need a top-level dashboard, something like the number of countries in the DANGER, WARNING, and CONTROLLED buckets
- Should look into using scipy’s linregress method to get accuracy values – done!
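A minimal sketch of what such a fit reports, assuming "accuracy" here means goodness of fit (r squared). To stay dependency-free it computes the least-squares slope, intercept, and Pearson r by hand instead of calling scipy; these correspond to the `slope`, `intercept`, and `rvalue` fields that `scipy.stats.linregress` returns. The function name `simple_linregress` and the sample numbers are made up for illustration.

```python
from math import sqrt
from typing import Sequence, Tuple


def simple_linregress(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (slope, intercept, r) for an ordinary least-squares line fit."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-deviations
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values are all identical, slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With constant y the correlation is undefined; 0.0 is returned here by choice
    r = sxy / sqrt(sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r


if __name__ == "__main__":
    days = [0, 1, 2, 3, 4]                 # hypothetical x values
    cases = [1.0, 3.1, 4.9, 7.2, 9.0]      # hypothetical y values
    slope, intercept, r = simple_linregress(days, cases)
    print(f"slope={slope:.3f} intercept={intercept:.3f} r^2={r * r:.3f}")
```

An r squared near 1 means the line explains most of the variation; whether that is the right "accuracy" number for the dashboard is a judgment call.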
COVID Twitter
- Read xls files into db (using this)
- Starting with reading the xls files into a Pandas Dataframe
- Wow, you can recursively get files in three lines, including the import:
import glob
for filename in glob.iglob("./" + '**/*.xls', recursive=True):
    print(filename)
- Had to do a bunch of things to get Arabic to store correctly. I think I need to set the database to:
alter database covid_misinfo character set utf8 collate utf8_general_ci;
Then set the table to utf8, like so:
DROP TABLE IF EXISTS `table_tweets`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `table_tweets` (
  `GUID` bigint(20) NOT NULL,
  `date` datetime NOT NULL,
  `URL` varchar(255) DEFAULT NULL,
  `contents` mediumtext NOT NULL,
  `translation` varchar(255) DEFAULT NULL,
  `author` varchar(255) DEFAULT NULL,
  `name` varchar(255) DEFAULT NULL,
  `country` varchar(255) DEFAULT NULL,
  `city` varchar(255) DEFAULT NULL,
  `category` varchar(255) DEFAULT NULL,
  `emotion` varchar(255) DEFAULT NULL,
  `source` varchar(255) DEFAULT NULL,
  `gender` varchar(16) DEFAULT NULL,
  `posts` int(11) DEFAULT NULL,
  `followers` int(11) DEFAULT NULL,
  `following` int(11) DEFAULT NULL,
  `influence_score` float DEFAULT NULL,
  `post_title` varchar(255) DEFAULT NULL,
  `post_type` varchar(255) DEFAULT NULL,
  `image_url` varchar(255) DEFAULT NULL,
  `brand` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`GUID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Anyway, it’s now working! (RT @naif_khalaf رحلة تطوير لقاح وقائي لمرض كورونا. استغرقت ٤ سنوات من المعمل لحيوانات التجارب للدراسات الحقلية على الإبل ثم للدراسة السريرية الأولية على البشر المتطوعين. ولازالت مستمرة. https://t.co/W3MjaFOAoC)
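One thing worth checking with this charset: MySQL's `utf8` stores at most 3 bytes per character, which covers Arabic (2 bytes per letter in UTF-8) but not emoji, which need 4 bytes and the `utf8mb4` charset. A small sketch for testing tweet text before inserting it; the helper names `max_utf8_bytes` and `fits_mysql_utf8` are made up for this note, and whether this dataset actually contains emoji is an open question.

```python
def max_utf8_bytes(text: str) -> int:
    """Return the largest number of UTF-8 bytes any single character in text needs."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_mysql_utf8(text: str) -> bool:
    """True if every character fits in 3 bytes, the limit of MySQL's utf8 charset."""
    return max_utf8_bytes(text) <= 3


if __name__ == "__main__":
    samples = [
        "رحلة تطوير لقاح وقائي",   # Arabic text from the tweet above
        "stay safe 😷",             # hypothetical tweet containing an emoji
    ]
    for s in samples:
        print(max_utf8_bytes(s), fits_mysql_utf8(s))
```

If any row fails the check, switching the database and table to `utf8mb4` would be the usual fix.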
GOES
- Write, visualize, and query test data
- Writing seems to be working? I don’t get any errors, but I can’t see anything show up
- Here’s an example of the data in what I think is correct line format:
measure_1, tagKey_1=tagValue_11 val_1=0.0 1586270395
measure_1, tagKey_1=tagValue_11 val_1=0.09983341664682815 1586270405
measure_1, tagKey_1=tagValue_11 val_1=0.19866933079506122 1586270415
measure_1, tagKey_1=tagValue_11 val_1=0.2955202066613396 1586270425
measure_1, tagKey_1=tagValue_11 val_1=0.3894183423086505 1586270435
measure_1, tagKey_1=tagValue_11 val_1=0.479425538604203 1586270445
measure_1, tagKey_1=tagValue_11 val_1=0.5646424733950355 1586270455
measure_1, tagKey_1=tagValue_11 val_1=0.6442176872376911 1586270465
measure_1, tagKey_1=tagValue_11 val_1=0.7173560908995228 1586270475
measure_1, tagKey_1=tagValue_11 val_1=0.7833269096274834 1586270485
measure_1, tagKey_1=tagValue_11 val_1=0.8414709848078965 1586270495
measure_1, tagKey_1=tagValue_11 val_1=0.8912073600614354 1586270505
Here’s how I’m writing it:
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

def to_influx(self, client:InfluxDBClient, bucket_name:str, org_name:str):
    write_api = client.write_api(write_options=SYNCHRONOUS)
    for t in self.measurement_list:
        # t[0] is the value, t[1] is the Unix timestamp
        for key, val in self.tags_dict.items():
            # In this snippet t[1] is only printed; it is not attached to the Point
            p = Point(self.name).tag(key, val).field(self.keyfield, t[0])
            write_api.write(bucket=bucket_name, record=p)
            print("writing {}, {}={}, {}={} {}".format(self.name, key, val, self.keyfield, t[0], t[1]))
That seems to work. Here’s the output while it’s storing:
writing measure_10, tagKey_1=tagValue_101, val_10=-0.34248061846961253 1586277701
writing measure_10, tagKey_1=tagValue_101, val_10=-0.2469736617366209 1586277691
writing measure_10, tagKey_1=tagValue_101, val_10=-0.1489990258141953 1586277681
writing measure_10, tagKey_1=tagValue_101, val_10=-0.04953564087836742 1586277671
writing measure_10, tagKey_1=tagValue_101, val_10=0.05042268780681122 1586277661
I get no warnings or errors, but the Data Explorer is blank:
- Oh, you have to use Unix timestamps in milliseconds (timestamp * 1000):
mm.add_value(val, ts*1000)
- Ok, it’s working, but my times are wrong
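Wrong times often come from a unit mismatch: the same integer means very different dates depending on whether it is read as seconds, milliseconds, or nanoseconds. The sketch below scales a Unix timestamp to a chosen precision, shows what a given integer decodes to under each unit, and builds a line-protocol string (measurement, then comma-joined tags with no space after the comma, then fields, then the timestamp). The helper names are made up for this note, escaping of special characters is skipped, and which precision the bucket or client actually expects is an assumption to verify against the InfluxDB docs.

```python
from datetime import datetime, timezone
from typing import Dict

# Number of timestamp units in one second for each precision label
UNITS_PER_SECOND = {"s": 1, "ms": 1_000, "us": 1_000_000, "ns": 1_000_000_000}


def to_precision(unix_seconds: int, precision: str) -> int:
    """Scale a Unix timestamp in whole seconds to the given precision."""
    return unix_seconds * UNITS_PER_SECOND[precision]


def as_utc(timestamp: int, precision: str) -> datetime:
    """Decode an integer timestamp, read in the given precision, as a UTC datetime."""
    return datetime.fromtimestamp(timestamp / UNITS_PER_SECOND[precision], tz=timezone.utc)


def to_line(measurement: str, tags: Dict[str, str], fields: Dict[str, float], timestamp: int) -> str:
    """Build one line-protocol record (no escaping; assumes plain names and values)."""
    tag_part = "".join(f",{k}={v}" for k, v in tags.items())
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {timestamp}"


if __name__ == "__main__":
    ts = 1586270395  # first timestamp from the sample data above, in seconds
    for unit in UNITS_PER_SECOND:
        # The same integer read under each unit: only "s" lands in 2020
        print(unit, as_utc(ts, unit).isoformat())
    print(to_line("measure_1", {"tagKey_1": "tagValue_11"}, {"val_1": 0.0}, to_precision(ts, "ms")))
```

If the points show up near January 1970 in the Data Explorer, the value is probably being read in a finer unit than it was written in.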
- 1:00 IRAD meeting
ML Seminar
- Good chat on Neural Ordinary Differential Equations