Sunday, November 14, 2021

Storing secrets in git repository

Sometimes you may want to store secrets in a git repository and keep track of their change history. For example, you may want to store a production configuration or other sensitive information.

You may already know that git-crypt can be your friend in this case. But what can you do if you develop software on Windows, like I do? Using this nice tool on Windows can be a problem.

You may find git-crypt binaries somewhere on the internet, but would you trust such binaries? I believe not.

What can we do in this case? We can build our own binary and test it on our repository.

In this article we will build such a binary and run a few experiments to check that it works for us.

Sunday, February 14, 2021

Python: improving time required to load 1.5G CSV file

In the previous post we discussed how to search for a specific item in a 1.5 GB CSV file with 132M records. We improved the search time from 73625.309 ms (more than a minute) to just ~0.005 ms, almost 15 million times faster, which is a pretty impressive improvement in my opinion.

But there is still one bottleneck that can be improved: the time required for the initial scan of the file. Let's try to improve it in this article.

All source files can be found in JFF-Bohdan/item_lookup

Python: playing with big lists (132M records), checking if item in list

Let's imagine a situation where we need to check whether an item is in a list, and the list is pretty big. For example, we may have a file with hundreds of millions of records, and we need to develop a solution that can quickly tell whether a specific item is in that list or not.

Let's start with a naive implementation and then try to improve it iteratively.
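To make the idea of a naive starting point concrete, here is a minimal sketch and not the code from the post itself. It assumes the items are plain strings, uses a hypothetical helper `build_items` to generate sample data, and compares a linear scan with a lookup in a `set`, which is just one common way to speed up membership checks.

```python
import time


def build_items(count):
    # Hypothetical sample data; the real post reads records from a CSV file.
    return [f"item-{i}" for i in range(count)]


def naive_contains(items, target):
    # Linear scan: compares the target with every element until a match is found.
    for item in items:
        if item == target:
            return True
    return False


items = build_items(100_000)
# Assumption: building a set once is acceptable; later lookups are hash based.
lookup = set(items)

target = "item-99999"    # last element, worst case for the linear scan
missing = "item-missing"  # not present in the data at all

start = time.perf_counter()
found_naive = naive_contains(items, target)
naive_ms = (time.perf_counter() - start) * 1000

start = time.perf_counter()
found_set = target in lookup
set_ms = (time.perf_counter() - start) * 1000

print(f"naive scan: found={found_naive}, {naive_ms:.3f} ms")
print(f"set lookup: found={found_set}, {set_ms:.3f} ms")
```

The exact timings depend on the machine and the data, so the sketch prints them instead of assuming any particular numbers. The set trades extra memory and a one-time build cost for much faster repeated lookups.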

This post was inspired by the post at https://habr.com/ru/post/538358/