3manuek

It is not down on any map; true places never are. -- Moby Dick by Sensi Seeds

PgIbz 2019
PgIbz 2019 -- This event was held in Ibiza, at the Convention Center, and was organized by the PostgreSQL Foundation. This pure-PostgreSQL event hosted several talks, among which I lectured “Pooling Performance”. The code used for this presentation will be made public soon, as I’m currently rewriting the benchPlatform into a more sophisticated tool. Slides can be found here.......
Google Cloud TCP Internal Load Balancing with HTTP Health Checks in Terraform for stateful services
GCP iLB and Terraform integration: general considerations. Implementing an Internal Network Load Balancer in GCP through HCL (Terraform) requires placing a set of resources like Lego pieces in order to make it work inside your architecture. We are excluding the external option in this post, as it is not often used for stateful services or backend architectures such as databases, which are the concern here. Also, the Terraform implementations differ considerably between the two: certain resources such as the Target Pool aren’t used in the internal scheme, so the autoscaling configuration is wired differently from its external counterpart. A full read of the Google Cloud load balancing documentation is recommended. Setting up a Load Balancer will depend on which resources have been chosen for spinning up the computes: google_compute_region_instance_group_manager, google_compute_instance_group_manager, or single computes. In this particular post I’m going to stick to google_compute_region_instance_group_manager for the sake of abstraction. The iLB, as shown in the current post, points (through the Backend Service) to the node that returns OK to its corresponding Health Check (the internal mechanics of this are commented on in the Health Check section below). Unlike a stateless fleet, for stateful services only one node can hold the leader lock for receiving writing transactions.......
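A minimal sketch of how those pieces can be wired together in HCL, not the post's actual code: all names, ports, the var.* references, and the leader-only health-check endpoint are illustrative assumptions, and the regional MIG is assumed to be defined elsewhere.

```hcl
# Assumes a regional MIG named google_compute_region_instance_group_manager.pg exists,
# and that each node exposes a hypothetical leader-only HTTP endpoint (e.g. a
# Patroni-style REST API on port 8008) so only the current leader answers 200.

resource "google_compute_health_check" "pg_leader" {
  name                = "pg-ilb-leader-hc"
  check_interval_sec  = 5
  timeout_sec         = 3
  healthy_threshold   = 2
  unhealthy_threshold = 2

  http_health_check {
    port         = 8008        # assumption: leader-only health endpoint
    request_path = "/master"
  }
}

resource "google_compute_region_backend_service" "pg" {
  name                  = "pg-ilb-backend"
  region                = var.region
  protocol              = "TCP"
  load_balancing_scheme = "INTERNAL"
  health_checks         = [google_compute_health_check.pg_leader.self_link]

  backend {
    group = google_compute_region_instance_group_manager.pg.instance_group
  }
}

resource "google_compute_forwarding_rule" "pg_ilb" {
  name                  = "pg-ilb"
  region                = var.region
  load_balancing_scheme = "INTERNAL"
  ip_protocol           = "TCP"
  ports                 = ["5432"]
  backend_service       = google_compute_region_backend_service.pg.self_link
  network               = var.network
  subnetwork            = var.subnetwork
}
```

Note there is no Target Pool here: in the internal scheme the Backend Service points directly at the regional instance group, and since only the node holding the leader lock passes the health check, writes land on a single node.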
Nerdear.la 2018
Great event to attend if you are around Buenos Aires in October: Nerdear.la. It hosts talks on a wide variety of topics, from high-level to hardcore IT. I lectured (in Spanish) about Relational Databases: the good, the bad and the ugly ......
ClickHouse sampling on the MergeTree engine
Why is sampling important and what do you need to be aware of? When dealing with very large amounts of data, you probably want to run your queries only on a smaller dataset of your current tables, especially if your dataset does not fit in RAM. MergeTree is the first and most advanced engine on ClickHouse that you want to try. It supports indexing by Primary Key, and it is mandatory to have a column of Date type (used for automatic partitioning). It is the only engine that supports sampling, and only if the sampling expression was defined at table creation. So, the rule of thumb is that if the dataset does not fit in RAM, you will prefer to create the table with sampling support. Otherwise, there is no performance gain in using sampling on relatively small tables that fit in RAM. The sampling expression applies a hash function over a chosen column, which must be part of the primary key, in order to spread the data pseudo-randomly. You can then use this feature by accessing the data with the SAMPLE clause in the query. Values of aggregate functions are not corrected automatically, so to get an approximate result, the value of ‘count()’ is manually multiplied by the factor of the sample.......
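A minimal sketch of the idea, using the legacy MergeTree syntax the post describes; the table and column names (hits, event_date, user_id) are illustrative, not taken from the post.

```sql
-- Legacy MergeTree signature: the Date column drives automatic partitioning and
-- intHash32(user_id) is the sampling expression, which must also appear in the
-- primary key tuple.
CREATE TABLE hits
(
    event_date Date,
    user_id    UInt64,
    url        String
) ENGINE = MergeTree(event_date, intHash32(user_id), (event_date, intHash32(user_id)), 8192);

-- Read roughly 10% of the data. Aggregates are not corrected automatically, so
-- count() is multiplied back by the inverse of the sampling factor.
SELECT count() * 10 AS approx_rows
FROM hits
SAMPLE 0.1
WHERE event_date >= today() - 7;
```

Reading with SAMPLE 0.1 touches roughly a tenth of the rows, which is why count() is scaled back up by 10 to approximate the true total.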
Percona Live 2017
Percona Live 2017 -- Held at Santa Clara, CA, where I lectured about one of the most significant features in Postgres, Logical Replication: ......