Patch 10.0.1 – ECS Fargate and SSO
This was considered a Major Patch deserving of the 10.0 Version Number because we finally migrated our entire TLDCRM System from AWS EC2 to AWS ECS Fargate. We also heavily modified the Login System to Support SAML SSO (Single Sign On) as well as more robust Login Logs and Session Management.
AWS ECS Migration – TLDCRM Containerized
We’ve spent the past 6 months or so preparing for this migration and it happened last week! We had a bit of a Snafu with some file uploads, but outside of that it went very smoothly.
Non-Document File Uploads have been fixed and the filesystem is working correctly now.
With ECS Fargate Orchestration, we no longer have to manage the servers themselves, just the containers. Amazon handles the servers and decides where the containers run automatically on their platform. This is known as “Serverless Compute”. Not that there aren’t actually any servers, but that the client (us) doesn’t have to manage them; from our side they theoretically don’t exist, or at least are not accessible to us. Each Task is given a set amount of reserved CPU, RAM and Network Traffic. This is kind of like being able to run Multiple Operating Systems on one machine: the containers share the underlying operating system but don’t necessarily interact with it, which means we can run the Container’s OS very efficiently, with minimal applications and bloat installed. (Like removing the crappy apps you never wanted from a Windows install.)
Think of this like Virtual Servers: if you have ever heard of someone running Windows on a Mac, or Linux on Windows, it’s a similar concept, except it strips out all the fluff to run the servers.
Fargate manages which servers the tasks run on based on our requirements for CPU and RAM. It also handles Auto Scaling much better (adding more containers based on traffic patterns and removing them as well). We are even able to segregate specific TLD Endpoints to different services entirely, so for example if our External APIs are getting hammered, it won’t affect users using the CRM as greatly.
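To make the endpoint segregation idea concrete, here is a minimal sketch of prefix-based routing. It is not the actual TLD configuration: the path prefixes and service names below are hypothetical, and in practice this kind of split is usually configured at the load balancer rather than in application code.

```python
# Hypothetical path prefixes and service names, for illustration only.
# Rules are checked in order, so the most specific prefix goes first.
ROUTES = [
    ("/api/dialer/ready", "dialer-ping-service"),
    ("/api/", "external-api-service"),
]
DEFAULT_SERVICE = "crm-web-service"


def pick_service(path: str) -> str:
    """Return the name of the service that should handle a request path."""
    for prefix, service in ROUTES:
        if path.startswith(prefix):
            return service
    # Anything that matches no rule is treated as normal CRM traffic.
    return DEFAULT_SERVICE
```

With a split like this, a flood of requests on one prefix only loads the containers behind that one service, and each service can scale on its own.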
- More Secure.
- Server Updates are Managed by Amazon and always up to date; the server updates do not conflict with the Container configurations.
- When a server needs updating, Amazon handles it automatically: it first replaces the container on a different server, then decommissions the server it was running on.
- Auto Scaling and Deployment Speeds.
- Previously it took 30-40 minutes to push a code change or add more servers on Elastic Beanstalk. With Tasks it takes 4 minutes. The old wait left a huge gap when an error was discovered and we needed to patch quickly.
- Faster Container Build Time allows us to react to increases in average traffic more swiftly, so we can run fewer servers at night and more during the day for peak loads (Cost Savings! A lot!). For example, we can run 4 containers at night, scale up to 30 during the day, then scale back down again automatically.
- Multiple Scaling Clusters
- We can Segregate API Traffic like the Dialer Ready Ping from Normal CRM usage and Traffic to separate clusters of containers.
- This is HUGE since 90% of our Traffic is the Dialer Ready Ping alone ( 3-5 million hits daily )
- Run More Lower Power Containers
- Cheaper Smaller Servers, more Spread, more Stability, less likely for one server to get bogged down.
- More Performant.
- During Testing we found we were able to push 2 – 3x more traffic on lower Powered containers than our current EC2 Servers in Elastic Beanstalk.
- We tested pushing traffic beyond what can take down the EC2 Servers, and it survived, with lag but without fully going down.
We are super happy with the performance thus far and are still fine-tuning our autoscaling. We hope you see better system performance!
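As a rough illustration of the scale-up and scale-down behavior described above, the sketch below applies a simple proportional rule and clamps the result between the 4 and 30 container example. The rule, the metric and the target value are assumptions made for illustration; this is not the actual scaling policy in use.

```python
import math

MIN_TASKS = 4    # night-time floor from the example above
MAX_TASKS = 30   # daytime ceiling from the example above


def desired_task_count(current_tasks: int, metric_value: float, target_value: float) -> int:
    """Scale the task count in proportion to how far a load metric
    (for example average CPU percent) is from its target, then clamp
    the result to the configured bounds."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    proposed = math.ceil(current_tasks * metric_value / target_value)
    return max(MIN_TASKS, min(MAX_TASKS, proposed))
```

For example, 10 tasks running at 90% against an assumed 60% target would ask for 15 tasks, while a quiet night never drops below the floor of 4.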
Other AWS Migrations
We have been hard at work migrating everything and we are nearing the close for all Non-Dialer Related Migrations.
Migrations Complete
- TLDCRM Main Platform – From ElasticBeanstalk to ECS Fargate
- Domains and Route53 – Routing
- EFS – File Storage
- SES – For MFA Emails
- ElastiCache – Memory Store for Caching
- WAF – Web Application Firewall
Migrations Pending
- RDS – The Database
- S3 – Buckets of Files and Backups
- CDN – Content Delivery Network for Static Files
- TLDNode – NodeJS (JavaScript) Backend Applications and Websockets
- TLDialer Clusters
- This one will be the most difficult and final project as it includes a Dialer Update / Refresh and OS Upgrade and possibly some major rearchitecting.
Hope this gives you some insight as to what has been going on in the background here while we prepare for a hopefully smooth Open Enrollment!
SAML SSO
Tired of MFA? Hate having to deal with opening your email to get a code to login? Well, if you have an SSO IDP ( Identity Provider ) like Google Workspace, Microsoft Azure, Okta or many others, you can now use TLD with SAML SSO to login to TLD with one click, as long as the email for the SSO User Matches a User in your TLD Account!
“Wait, but I have Multiple SSO providers! Some of my guys use Google and some use Okta!” No Problem! We wrote our SSO Integration to support Multiple SSO Providers on a single account. This way if you are using the Agency Module with multiple Agencies they can each have their own SSO, or continue to login with Email Multi-factor for the Luddites out there.
In the Settings -> Fields Section you can add a “SSO Container” to your account. You will have to be a Super administrator to add or manage these. The Configuration Panel looks like the below:
As you can see there is a lot of text here explaining what you should do. There are also advanced tabs and configurations for more esoteric or weird SAML SSO Integrations.
Generally speaking, the 4 Blank Fields on this main page are all you need to configure. Make sure your SSO is using Email for the Name ID. From our end all you have to do is copy the Assertion Consumer Service URL (ACS URL) and add it to your SAML SSO Provider on your end.
Once you have done this you can then Enable the SAML Containers for your Account. This can be found in Settings -> Access under the SSO Tab as seen below
Once you have Checked off a SAML Provider, a Button to login with will show on your Login Page. You can customize the Button Look and Feel in the Buttons Tab in the SAML Container.
By Default, Users cannot log into SSO unless the Account has the Unrestricted Flag Set to On, the User has the Unrestricted Flag Set to On under their Access Tab, or the User has the SSO Provider Selected in their Access Tab.
The 3 Settings Seen here also exist in the User’s Profile under the Access Tab.
Note: Account Level Settings Override User Level Settings.
This is what the following Settings Do:
- SSO Only
- Only Allow SSO Logins (Disable Credentials Login )
- This Prevents Users from Logging in with their Username and Password
- Users can Only SSO
- SSO Unrestricted
- Allow Login from All Account Enabled SSO Providers
- This Bypasses the Multi-checkbox selections on the users and allows all users on your account to login with Any SSO that is Selected in the Account Panel
Remember, These Settings can be Applied to Individuals or to your entire Account so choose depending on your needs!
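The access rules above can be sketched as two small checks. This is only one reading of the rules, not the actual TLD implementation: the class and field names are hypothetical, and "Account Level Settings Override User Level Settings" is interpreted here as "a flag turned on for the Account applies to every User".

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class AccountSso:
    enabled_providers: Set[int] = field(default_factory=set)  # providers checked off for the account
    sso_only: bool = False
    sso_unrestricted: bool = False


@dataclass
class UserSso:
    selected_providers: Set[int] = field(default_factory=set)  # providers selected on the user's Access tab
    sso_only: bool = False
    sso_unrestricted: bool = False


def can_login_with_sso(account: AccountSso, user: UserSso, provider_id: int) -> bool:
    """A provider must be enabled on the account; after that, either an
    Unrestricted flag or an explicit per-user selection grants access."""
    if provider_id not in account.enabled_providers:
        return False
    if account.sso_unrestricted or user.sso_unrestricted:
        return True
    return provider_id in user.selected_providers


def can_login_with_credentials(account: AccountSso, user: UserSso) -> bool:
    """SSO Only on either level disables username and password login."""
    return not (account.sso_only or user.sso_only)
```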
Email Verification Requirements
For New Users, there is still a requirement to Verify their Email Address once, or on Email Change, if they want to SSO.
Once the Email is Verified you will not have to MFA into the system as your IDP Handles the Access Controls!
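A minimal sketch of the email side of this, under stated assumptions: the Name ID from the IDP is matched to a User by email, and SSO is only allowed once the current email has been verified. The field names are hypothetical, and the case-insensitive comparison is an assumption rather than documented TLD behavior.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class User:
    user_id: int
    email: str
    verified_email: Optional[str] = None  # last address the user verified (hypothetical field)


def normalize(email: str) -> str:
    # Assumption: emails are compared ignoring case and surrounding spaces.
    return email.strip().lower()


def find_sso_user(users: List[User], name_id: str) -> Optional[User]:
    """Match the Name ID sent by the IDP to a User by email address."""
    wanted = normalize(name_id)
    for user in users:
        if normalize(user.email) == wanted:
            return user
    return None


def sso_email_ready(user: User) -> bool:
    """True only when the current email is the one that was verified,
    so changing the email requires verifying again."""
    if user.verified_email is None:
        return False
    return normalize(user.verified_email) == normalize(user.email)
```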
Password Reset Requirements
If you have SSO Only Enabled, then the User will also not need to Update their password per Password Change Policies because Login via Password is Disabled and you are controlling Password Requirements on your IDP.
This makes it SO Easy to Login and use TLD!
Coming Soon: Mass Add / Remove SSO Providers
Login Logs & Sessions – Major Revamp
Due to the Addition of SSO we took a pass over the Login Logs and Sessions and made the following Changes
- Added the Following Columns to Login Logs
- success: 1 or 0 if Login was Successful
- date_logout: The Date the User Login Session Logged Out.
- logout_reason: The Reason for Logout, system generated, “LOGOUT”, “IP_RESTRICTION”, “EMAIL_UNVERIFIED”, “MOVE”, “DEACTIVATED” etc.
- sso: The SSO Container ID when using SSO
- sso_account: The SSO Account the Container ID is Related to.
- provider: “Credentials”, or the Name / Label of the SSO Provider at the Time.
- logout_id: The Login ID From Which the Session or Sessions initiated a Logout
- ender_id: The User ID Responsible for Logging the User Out, as it can be initiated by an Administrator
- ip_logout: The IP Address the Logout Took Place
- Added the Following Columns to User Sessions
- sso: The SSO Container ID
- sso_account: The SSO Account for SSO Container
- login_id: The Login ID the Session Started From
When Logging into TLD your Session will now have a Direct Correlation to the Login Log that was successful so we can more easily see which sessions are actually active.
On Logout, the Date Logout, Ender ID, IP Logout and Logout Reason are automatically set.
The Logout ID is the Login ID from which the Logout was initiated. If it was Admin Initiated, we use the Latest Successful Login ID that has not yet been Logged Out.
In the case of multiple sessions being active, The Logout ID will be the same on all of them.
There will be more updates to this in the future as we see the data moving around!
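The logout bookkeeping described above can be sketched roughly as follows. The column names come from the list above, but everything else is an assumption made for illustration: the function name is hypothetical, "latest" is taken to mean the highest Login ID, and a logout is modeled as closing every open login for the User.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class LoginLog:
    login_id: int
    user_id: int
    success: int                          # 1 or 0
    provider: str = "Credentials"
    date_logout: Optional[datetime] = None
    logout_reason: Optional[str] = None
    logout_id: Optional[int] = None
    ender_id: Optional[int] = None
    ip_logout: Optional[str] = None


def logout_user(logs: List[LoginLog], user_id: int, ender_id: int, ip: str,
                reason: str = "LOGOUT",
                initiating_login_id: Optional[int] = None) -> List[LoginLog]:
    """Close every open successful login for a user and stamp the logout columns."""
    open_logs = [log for log in logs
                 if log.user_id == user_id and log.success == 1 and log.date_logout is None]
    if not open_logs:
        return []
    if initiating_login_id is None:
        # Admin initiated: fall back to the latest successful login not yet logged out.
        initiating_login_id = max(log.login_id for log in open_logs)
    now = datetime.now(timezone.utc)
    for log in open_logs:
        log.date_logout = now
        log.logout_reason = reason
        log.logout_id = initiating_login_id   # same value on every closed login
        log.ender_id = ender_id
        log.ip_logout = ip
    return open_logs
```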
File Uploads Fixed
A Couple of days after the switch to ECS we had some system wide issues that had passed testing because we didn’t have as many ECS Tasks running then. The root of the issue was that we had changed the Root Directory for our EFS Volume, and some of the code assumed the volume was in a specific spot. As a result, files were either not uploading or were uploading to a local task instead of the shared drive, which is why they would disappear on refresh at times. We’ve updated the codebase to support the new mount point.
Areas that were Affected by this Issue (with Inconsistent Results) and have now been Fixed:
- Commission Alignments Uploader
- Lead Uploader
- Lead Deduper
- JSON to CSV Converter Tool
- Audio to VICI Audio Converter Tool
- Spreadsheet to VICI CSV Converter tool
- Spreadsheet with JSON Column Data to CSV Converter Tool
- Google Audio Sample Generator
- Download All for Recordings
- Internal JSON Logs for Query Errors and System Errors
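One common way to avoid this class of bug is to resolve every upload path from a single configurable mount point instead of a hard-coded directory. The sketch below shows the idea only; the environment variable name, the default path and the function name are hypothetical, since the real mount point is not stated in these notes.

```python
import os
from pathlib import Path

# Hypothetical variable name and default, for illustration only.
EFS_ROOT_ENV = "EFS_MOUNT_ROOT"
DEFAULT_EFS_ROOT = "/mnt/efs"


def shared_upload_path(filename: str, subdir: str = "uploads") -> Path:
    """Build a path on the shared volume so every task writes to the same place."""
    root = Path(os.environ.get(EFS_ROOT_ENV, DEFAULT_EFS_ROOT))
    # Keep only the final path component so a crafted name cannot escape the folder.
    safe_name = Path(filename).name
    if not safe_name:
        raise ValueError("filename is empty")
    return root / subdir / safe_name
```

If the volume moves again, only the one setting changes and every uploader and converter tool follows it.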
Other Changes and Fixes
- Fixed Encryption Toggle for Dialer Ready URL Generator.
- Some Webhook Path Updates have been added for the Upcoming Health Sherpa Integration.
- Removed old TLDSIP Download Link. It’s no longer supported.
- JSON Array Queries should now be fully supported with TQL.
- Added Campaign ID Override for TLDialer Lead Hopper Send