
# Operational Maintenance

Once Privilege Secure is deployed, perform the following maintenance and operational tasks at regular intervals:

## Privilege Secure Processes (suggested daily)


![image.webp](/images/privilegesecure/4.2/discovery/admin/maintenance/360053539334_image.webp)

- The manager status shows "Leader" for a single node, or "Leader" and "Reachable" for a cluster as above. The Leader is the primary manager node and the Reachable nodes are managers. If a node fails, an election takes place to determine a new primary manager. A single node can fail without any indication to users.
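A quick health check on the manager status can be scripted. This is a minimal sketch using hypothetical sample text; in production, feed it from `sudo docker node ls` (whose MANAGER STATUS column `s1 nodes` also surfaces). A healthy swarm shows exactly one Leader and no Unreachable managers.

```shell
# Count swarm Leaders and Unreachable managers from node-status output.
# The sample below is illustrative, not real cluster output.
sample='ID   HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
aaa  node1     Ready   Active        Leader
bbb  node2     Ready   Active        Reachable
ccc  node3     Ready   Active        Reachable'
leaders=$(printf '%s\n' "$sample" | grep -c 'Leader$')
unreachable=$(printf '%s\n' "$sample" | grep -c 'Unreachable$' || true)
echo "leaders=$leaders unreachable=$unreachable"
# prints: leaders=1 unreachable=0
```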

## Disk Maintenance (suggested weekly)

- Check disk space periodically with `df -h` (disk free, with the `-h` switch for human-readable output).
- Investigate disk usage over 80% for specific causes of excessive usage. If the space cannot be reduced, add additional disk space. The Customer Success team can assist with reviewing and reducing used disk space and with increasing disk space (physical or virtual, depending on the environment).
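The 80% check can be automated. The sketch below parses hypothetical sample text shaped like `df` output; in production you would pipe `df -h --output=pcent,target` (GNU coreutils) into the same `awk` filter. The 80% threshold follows the guidance above.

```shell
# Flag filesystems above 80% usage from df-style "Use% / Mounted on" lines.
sample='Use% Mounted on
42% /
91% /secureone'
printf '%s\n' "$sample" | awk 'NR > 1 { p = $1; sub(/%/, "", p); if (p + 0 > 80) print "High usage:", $2, $1 }'
# prints: High usage: /secureone 91%
```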

## Backups

Review (suggested weekly)

- Check that backups are running on the production node and copying successfully to DR.
- Confirm that backups with appropriate recent date exist on DR server. A cron script typically runs
backups daily.
- If the backup job is set to restore the database backup daily, check the logs for any errors.
- Production Primary Node:
- ls -lath /secureone/data/db/backups
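The "recent backup exists" check can be scripted as a small helper. This is a sketch, not a product utility: the directory path comes from the listing above, and the 2-day threshold is an assumption based on the daily cron schedule (the argument exists so you can point the helper elsewhere).

```shell
# Warn when the newest file in a backup directory is older than a threshold.
check_recent_backup() {
  dir=$1; days=${2:-2}
  latest=$(ls -t "$dir" 2>/dev/null | head -n1)
  if [ -n "$latest" ] && [ -n "$(find "$dir/$latest" -mtime "-$days" 2>/dev/null)" ]; then
    echo "OK: recent backup $latest"
  else
    echo "WARN: no backup newer than $days days in $dir"
  fi
}
check_recent_backup /secureone/data/db/backups
```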

Test (suggested minimum yearly)

- Test DR at least yearly. This can be part of a wider DR test or specific to
Privilege Secure.
- Create a test window, since changes to Privilege Secure DR don't propagate back to the main production instance.
- After a test, ensure that services are down on the DR node. This avoids the DR environment overriding changes from Production. Use the following command:

- `s1 stop expire worker scanner`

## Run Quickstart to Review Protect Mode and Persistence (suggested weekly)

- Run Quickstart frequently to ensure that all machines remain in protect mode. Filter the Quickstart
output for any machines that don't show in protect mode. This ensures that the build process and
addition of protect mode runs as expected.
- Quickstart can also be used to review persistent access, which should be minimized to service accounts. Review for user accounts set to persistent; if they truly require persistence, switch them to service accounts.
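Filtering the Quickstart output can look like the following. This is illustrative only: the file name and the `hostname,mode` columns are assumptions, since Quickstart's actual export format is not shown here; adapt the field positions to your real output.

```shell
# Build a hypothetical CSV export, then list hosts not in protect mode.
cat > /tmp/quickstart_sample.csv <<'EOF'
hostname,mode
web01,protect
db01,audit
app02,protect
EOF
awk -F, 'NR > 1 && $2 != "protect" { print $1 }' /tmp/quickstart_sample.csv
# prints: db01
```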

## Privilege Secure Log Review - SIEM (suggested weekly)

- A SIEM solution is typically best placed to report issues captured in Privilege Secure logs.
- For Splunk query examples, see [Splunk and SIEM Queries](../../integrations/siem/splunkqueries.md).
- Review the "change system policy" output for any removal of protect mode.
- The "slow JITA access" and "time it takes for JITA access" indicate whether users experience slow
privilege elevation when using Privilege Secure.

## Mongo Health (suggested weekly)

Time Sync

- Run the following command to check that servers remain in sync:
- `sudo docker exec -it $(sudo docker ps | grep mongo | cut -d' ' -f1) mongo SecureONE --eval 'rs.printSlaveReplicationInfo()'`
- The result should show that there is little or no difference in time:
source: mongo3:27017
syncedTo: Tue Sep 22 2020 16:03:34 GMT+0000 (UTC)
0 secs (0 hrs) behind the primary
- If there is a variation in time, contact the Remediant Customer Success team.
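Detecting a lag automatically can be sketched as below. The sample text mirrors the `rs.printSlaveReplicationInfo()` output shown above; the 10-second alert threshold is an assumption for illustration, not a product default.

```shell
# Parse "N secs ... behind the primary" lines and flag any lag over 10 s.
sample='source: mongo2:27017
syncedTo: Tue Sep 22 2020 16:03:34 GMT+0000 (UTC)
0 secs (0 hrs) behind the primary
source: mongo3:27017
syncedTo: Tue Sep 22 2020 16:03:34 GMT+0000 (UTC)
0 secs (0 hrs) behind the primary'
printf '%s\n' "$sample" | awk '/behind the primary/ && $1 + 0 > 10 { lag = 1 } END { if (lag) print "WARN: replication lag"; else print "OK: replicas in sync" }'
# prints: OK: replicas in sync
```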

Mongo status

- `sudo docker exec -it $(sudo docker ps | grep mongo | cut -d' ' -f1) mongo SecureONE --eval 'rs.status()'`
## Version and OS updates (suggested quarterly minimum)

- Hold a quarterly meeting to review updates to the Privilege Secure version and OS patches/updates.
- Patches can be applied monthly. Contact the Customer Success team before updates to check for known issues.

**See also:** [Docker Credentials Helper](../../installation/dockercredentials.md) — secure
credential storage for pulling private registry images during NPSD deployment and upgrades.


# OS Updates

OS updates may require a reboot of the server or Docker service.

There are different options to fit your environment or downtime tolerance.

- Cluster: No downtime
- Single-Node: Downtime during reboot

This is a guide to updating the OS on a Privilege Secure node with that best practice in mind.
Privilege Secure will be offline for 30-60 minutes.

## Use Case: Cluster In-Place (1 node at a Time, No Downtime)

Check the DB replication status ("stateStr" should be "PRIMARY" or "SECONDARY") and the replication time difference (a few seconds is acceptable):

```
mEvl="sudo docker exec -it $(sudo docker ps | grep mongo | cut -d' ' -f1) mongo SecureONE --quiet --eval"; $mEvl 'rs.status()' | grep "name\|stateStr\|lastHeartbeatRecv\|lastHeartbeatMessage" | column -t; echo; $mEvl 'rs.printSlaveReplicationInfo()'; unset mEvl
```

**NOTE:** If the database replication isn't in a healthy state, resolve that before continuing.

Primary node only: Check Privilege Secure services and nodes status:

Check for the swarm leader:

```
s1 nodes | grep Leader
```

Only if the node being updated has a MANAGER STATUS of leader, run the following command on a
different node to change the swarm Leader, replacing `<hostname>` with the hostname of the node
being upgraded:

```
HNupg=`<hostname>` ; sudo docker node demote $HNupg; sleep 10; sudo docker node promote $HNupg
```

Verify swarm leader is no longer the node being updated:
- s1 nodes | grep Leader

Drain node to be updated:

- sudo docker node update --availability drain `<hostname>`

Verify "Availability" is set to "Drain" with:

- s1 nodes
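The Drain check can also be scripted. The sample output below is hypothetical; in production, pipe `s1 nodes` or `sudo docker node ls` through the same `awk` filter, and set `node` to the hostname being updated.

```shell
# Print the availability of one named node from node-status output.
sample='HOSTNAME  STATUS  AVAILABILITY
node1     Ready   Active
node2     Ready   Drain'
node=node2
printf '%s\n' "$sample" | awk -v n="$node" '$1 == n { print $1, $3 }'
# prints: node2 Drain
```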

**Step 3 –** Reboot if required.

- sudo cat /var/run/reboot-required
- Result if reboot is required: "\*\*\* System restart required \*\*\*"
- Result if reboot not required: "cat: /var/run/reboot-required: No such file or directory"
- If required, reboot node:

- sudo reboot
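The reboot check above can be made script-friendly. The flag file path is the Ubuntu/Debian convention this step relies on; the optional argument is only there to make the helper testable against another path.

```shell
# Exit 0 when a reboot-required flag file exists, nonzero otherwise.
reboot_needed() { [ -f "${1:-/var/run/reboot-required}" ]; }
if reboot_needed; then
  echo "Reboot required"
else
  echo "No reboot required"
fi
```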

- watch s1 nodes

- Verify "Availability" is set to "Active" on the updated node.

- After the updated node is reachable, press Ctrl+C to stop the watch command.
- Check the DB replication status ("stateStr" should be "PRIMARY" or "SECONDARY") and the replication time difference (a few seconds is acceptable):
- ```
mEvl="sudo docker exec -it $(sudo docker ps | grep mongo | cut -d' ' -f1) mongo SecureONE --quiet --eval"; $mEvl 'rs.status()' | grep "name\|stateStr\|lastHeartbeatRecv\|lastHeartbeatMessage" | column -t; echo; $mEvl 'rs.printSlaveReplicationInfo()'; unset mEvl
```

- sudo cat /var/run/reboot-required

- Result if reboot is required: "\*\*\* System restart required \*\*\*"
- Result if reboot not required: "cat: /var/run/reboot-required: No such file or directory"
- If required, reboot node:

- sudo reboot
After the reboot, verify services and node status:

- s1 status; s1 nodes

**See also:** [Docker Credentials Helper](../../installation/dockercredentials.md) — secure
credential storage required when pulling private registry images during NPSD upgrades.
