A cluster of clusters. In the cloud.

Network Security in Mozilla.

Created by Michal Purzynski / @michalpurzynski

Scripts are here - https://github.com/michalpurzynski

whoami

Part of the team doing enterprise information security

We don't do product security

We monitor our infrastructure

We respond to security investigations and incidents

We help developers design and implement security controls

We build tools & services to keep users secure

"A human Wireshark"

Why cluster of clusters?

Hosts   Cluster
5       SCL3 production
6       SCL3 special projects
2       SCL3 full packet capture
2       Suricata with 17 000 rules
8       Offices (soon, 5 now)
X       AWS (lots)

Challenge number one

Lots of traffic

60Gbit/sec from taps

Not everything is actionable

Lots of spare capacity - because you know, DDoS

Use a packet broker

Challenge number two

Tons of logs

40 000 000 connections per hour

70 000 000 log lines per hour

80GB of compressed logs per day

After filtering. Or multiply by 5.
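To put the hourly figures above into per-second terms, here is a small back-of-the-envelope sketch. The input numbers come from the slide. Applying the "multiply by 5" factor to both log lines and compressed volume is an assumption made for illustration.

```python
# Figures taken from the slide; everything derived from them is plain arithmetic.
CONNECTIONS_PER_HOUR = 40_000_000
LOG_LINES_PER_HOUR = 70_000_000
COMPRESSED_GB_PER_DAY = 80
UNFILTERED_MULTIPLIER = 5  # "Or multiply by 5" (assumed to apply to log volume)


def per_second(per_hour: int) -> float:
    """Convert an hourly rate into a per-second rate."""
    return per_hour / 3600


def summary() -> dict:
    """Return the filtered per-second rates and the estimated unfiltered volume."""
    return {
        "connections_per_sec": round(per_second(CONNECTIONS_PER_HOUR)),
        "log_lines_per_sec": round(per_second(LOG_LINES_PER_HOUR)),
        "unfiltered_log_lines_per_hour": LOG_LINES_PER_HOUR * UNFILTERED_MULTIPLIER,
        "unfiltered_compressed_gb_per_day": COMPRESSED_GB_PER_DAY * UNFILTERED_MULTIPLIER,
    }
```

That works out to roughly 11 thousand connections and 19 thousand log lines every second, after filtering.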

Cut traffic and logs without mercy

Petabytes of logs you cannot search?

Attackers move fast

Several hours to query is not an option

Identify what is most important

Measure log lag

Huge lag disables your detection! - monitor it
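One way to measure log lag is to compare the timestamp inside each event with the time it reaches the search or alerting system. The sketch below assumes events carry an epoch-seconds timestamp (as Bro's `ts` field does by default). The 300-second threshold and the function names are hypothetical.

```python
MAX_LAG_SECONDS = 300.0  # hypothetical alerting threshold, tune for your pipeline


def log_lag_seconds(event_ts: float, received_at: float) -> float:
    """Seconds between the event happening and the pipeline receiving it."""
    # Clamp at zero so small clock skew does not produce negative lag.
    return max(0.0, received_at - event_ts)


def lag_is_excessive(event_ts: float, received_at: float,
                     threshold: float = MAX_LAG_SECONDS) -> bool:
    """True when detection is running too far behind real time."""
    return log_lag_seconds(event_ts, received_at) > threshold
```

Feeding the lag value into the same metrics system that watches the sensors makes a stalled pipeline as visible as a dead host.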

Tons of logs

  • conn.log - 53/udp, 53/tcp, 123/udp, 137/udp, 161/udp, 5355/udp
  • conn.log - ^RSTO|^S0$|^SH$|^SHR$
  • conn.log - dynamic list of IP to ignore (DDoS)
  • dns.log - your own domains, monitoring services
  • files.log - some MIME types sent from you (application/pkix-cert)
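The conn.log rules above can be sketched as a single keep-or-drop predicate. This is an illustration in Python, not the Bro filter used in production. It assumes records are dictionaries keyed by the standard conn.log field names, and that the port list is matched against the responder port.

```python
import re

# (port, protocol) pairs from the slide that are dropped from conn.log.
NOISY_SERVICES = {
    (53, "udp"), (53, "tcp"), (123, "udp"),
    (137, "udp"), (161, "udp"), (5355, "udp"),
}

# Connection states from the slide. Note that ^RSTO has no trailing $,
# so it also matches RSTOS0.
NOISY_STATES = re.compile(r"^RSTO|^S0$|^SH$|^SHR$")


def keep_conn(rec: dict, ignored_ips: set) -> bool:
    """Return True if a conn.log record should be written, False if dropped."""
    if (rec["id.resp_p"], rec["proto"]) in NOISY_SERVICES:
        return False
    if NOISY_STATES.search(rec["conn_state"]):
        return False
    # Dynamic list of addresses to ignore, for example during a DDoS.
    if rec["id.orig_h"] in ignored_ips or rec["id.resp_h"] in ignored_ips:
        return False
    return True
```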

Collect metrics with SumStats

Tons of logs

  • http.log - Monitor-UA in rec$user_agent
  • mysql.log - only for subnets you expect no MySQL traffic
  • ssl.log - ignore some IP addresses (dynamic)
  • x509.log - you know your certificates
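A dynamic ignore list, like the one mentioned for ssl.log and conn.log, is easiest to live with when entries expire on their own. The sketch below is one possible shape for it. The class name, the TTL approach, and the example addresses are all hypothetical.

```python
class ExpiringIgnoreList:
    """Set of IP addresses that are ignored until their time-to-live runs out."""

    def __init__(self) -> None:
        self._expires_at = {}  # ip -> epoch seconds when the entry expires

    def add(self, ip: str, ttl_seconds: float, now: float) -> None:
        """Ignore `ip` for `ttl_seconds` starting at `now`."""
        self._expires_at[ip] = now + ttl_seconds

    def contains(self, ip: str, now: float) -> bool:
        """True while the entry for `ip` is still valid; stale entries are removed."""
        expiry = self._expires_at.get(ip)
        if expiry is None:
            return False
        if now >= expiry:
            del self._expires_at[ip]
            return False
        return True
```

Expiry keeps a forgotten entry from silently blinding detection for an address long after the reason for ignoring it has passed.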

Collect metrics with SumStats

Challenge number three

Expect everything

DGA? Nope, research.

Bruteforcing? Nope, monitoring.

Connections to bad sites? Making Firefox secure.

Challenge number three

We have people everywhere

Our people are everywhere

When working from home is part of your code

You have this, too - people travel a lot

Entice them with a travel VPN

Many offices all over the world?

Each has an independent cluster

A one-host "cluster"

Correlation in MozDef across the ocean

Room for improvement - Broker, anyone?

A global world cluster

  • Manager roles elected
  • Lost manager? Promote standby, Kafka style
  • Standby keeps state at all times
  • Lost connection? Local correlation
  • SumStats capable
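The failover idea in the list above can be sketched in a few lines. This is a toy model of the wish list, not an existing Bro feature. The `Node` type, the role names, and the replication function are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    role: str            # "manager", "standby" or "worker"
    alive: bool = True
    state: dict = field(default_factory=dict)


def replicate(manager: Node, standby: Node) -> None:
    """Standby keeps a copy of the manager's state at all times."""
    standby.state = dict(manager.state)


def current_manager(nodes: list):
    """Return the live manager, promoting a live standby if the manager is lost.

    Returns None when neither is reachable, which is the cue for
    each site to fall back to local correlation.
    """
    for node in nodes:
        if node.role == "manager" and node.alive:
            return node
    for node in nodes:
        if node.role == "standby" and node.alive:
            node.role = "manager"  # promotion
            return node
    return None
```

Because the standby already holds the state, promotion is just a role change, with no replay of events needed.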

Challenge number four

Surprising requests

Visibility "behind" the firewall

Specific details of HTTP sessions

DDoS analysis for mitigation

Better RADIUS logs than FreeRADIUS

We need all the traffic between A and B

For... science

Did you say science?

I did

Detection of SSL MiTM - Firefox updates

Challenge number five

We want it, too

"THAT is AMAZING"

"How do I..."

...write code without breaking production

AWS

AWS networking - EC2-Classic

  • No VPC
  • No concept of subnets
  • Public IP everywhere

AWS networking - VPC

  • VPC - an illusion of a Data Center
  • Your own subnets
  • Private and public IP
  • Routing
  • Every piece of networking separated

VPC - traffic flows

VM initiated traffic will go through Bro

Traffic to a public IP of a VM won't

ELB traffic bypasses Bro
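The three statements above amount to a small visibility table, written out below so the blind spots are explicit. The flow categories and names are labels invented for this sketch. The True/False values come straight from the slide.

```python
from enum import Enum


class Flow(Enum):
    VM_INITIATED = "vm_initiated"                   # outbound, started by a VM
    INBOUND_TO_PUBLIC_IP = "inbound_to_public_ip"   # arrives at a VM's public IP
    VIA_ELB = "via_elb"                             # arrives through an ELB


# Which kinds of traffic pass through the Bro sensor, per the slide.
VISIBLE_TO_BRO = {
    Flow.VM_INITIATED: True,
    Flow.INBOUND_TO_PUBLIC_IP: False,
    Flow.VIA_ELB: False,
}


def blind_spots() -> list:
    """Names of the flow types the sensor never sees."""
    return sorted(f.value for f, seen in VISIBLE_TO_BRO.items() if not seen)
```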

VPC - traffic flows

The NAT instance

A standard Linux VM with source/destination checks off

Roll your own, with any distribution

If it goes down, you have a problem

Autoscaling patched with scripts is kind of ugly
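Turning off the source/destination check is what lets an ordinary instance forward traffic for other hosts. A minimal sketch of that step, assuming a boto3 EC2 client is passed in (the helper name is hypothetical, and this is not necessarily how the instances in this talk are configured):

```python
def disable_source_dest_check(ec2_client, instance_id: str) -> None:
    """Allow the instance to forward packets that are not addressed to it.

    `ec2_client` is expected to be a boto3 EC2 client, for example
    boto3.client("ec2"). It is injected so the function is easy to test.
    """
    ec2_client.modify_instance_attribute(
        InstanceId=instance_id,
        SourceDestCheck={"Value": False},
    )
```

Running this from the instance's own boot scripts is one way to have a freshly launched replacement configure itself.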

AWS NAT gateway

No need to care about HA - it's there

Take it as it is - no way to roll your own

You don't get a shell. Or NSM.

Very limited - no way to intercept traffic

Your sysadmins will love it. You will not. Risk management FTW.

Our NAT instance

Even more challenges

NAT instance is your cluster. Another one.

Need to deliver logs somehow, somewhere.

You don't get a shell.

Needs to fix itself.

Connection split in kernels < 4.2

Even more challenges

Scripts building RPMs (fpm FTW).

Pre/post-install scripts set up structures.

Bake an AMI.

BroDog.
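The packaging step can be sketched as a function that assembles an fpm command line, with pre- and post-install scripts attached. The flags are standard fpm options. The package name, version, and paths in the usage note are hypothetical, and the real build scripts may look quite different.

```python
def fpm_rpm_command(name: str, version: str, source_dir: str,
                    before_install: str, after_install: str) -> list:
    """Build the argv for packaging a directory as an RPM with fpm."""
    return [
        "fpm",
        "-s", "dir",                          # source type: a directory tree
        "-t", "rpm",                          # target type: RPM package
        "-n", name,
        "-v", version,
        "--before-install", before_install,   # pre-install script
        "--after-install", after_install,     # post-install script
        source_dir,
    ]
```

For example, `subprocess.run(fpm_rpm_command("bro-sensor", "1.0", "/opt/bro", "pre.sh", "post.sh"), check=True)` would build the package on a host with fpm installed. The resulting RPM is then installed while baking the AMI.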

Traffic flows with Bro

https://github.com/michalpurzynski

@MichalPurzynski