-
Notifications
You must be signed in to change notification settings - Fork 131
Backport ccm main #1296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Backport ccm main #1296
Conversation
|
ad0c3aa
to
6e14276
Compare
Possible improvements / API changes after a brief glance at the code:
|
Notice that I implemented all of this when we had old certificates in the repository. We had to update them, because of the errors thrown by Why old certs worked for
|
Moving such workflows to CCM has 2 advantages:
If we move only part of it, we get neither. So I'll cherry pick commits that move auth, and skip TLS for now. We can do that in the future. Btw is it possible to use domain names instead of ip addresses with scylla? In other words, can we have domain names instead of ip addresses in system.peers in driver-relevant columns? |
scylla can use hostnames, but then you need a dns server to map them. I think generating certs as needed is the best approach, and also give the flexibility to try more variants as needed. |
Is there functionality in CCM to generate certs? Or do we have to do it other way? If Scylla can use hostnames, then we should test it too. @fruch one other question for you. Could you describe (or point to documentation if such exists) what exactly wait-related flags do in CCM, and how do they interact if I specify more than one? |
you are more then welcome to document it. |
I'd be happy to make a PR that improves descriptions, but I would have to first understand those options myself. |
6e14276
to
aeabc5e
Compare
I backported the commits that move auth to CCM. I also removed TLS support from CCM for now. |
Marking as ready. The way I see it the only improvement I can make here is better CCM API - which needs input from others, which is basically a review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We hold nodes in Arc<RwLock<>>, and methods that give user nodes return that. Maybe we could return refs / mut refs and get rid of Arc<RwLock<>>? I'm not sure.
As of now, there is no use case for Arc<RwLock<>>
- I think we can return refs/mutrefs for now. We could always revert this in the future. It also simplifies the API - I believe NodeList
is no longer necessary then. Instead, we can expose nodes_iter_[mut]()
and get_node_[mut]_by_id
methods on Cluster
.
Currently, append_node()
and add_node()
methods return Arc<RwLock<>>
. They could return node id instead.
|
aeabc5e
to
d546b53
Compare
I have one more idea: we can split off another file from cluster.rs, I would call it |
d546b53
to
f05e4dc
Compare
I did this for 2 commands as an experiment, in additional commit. I like the new version, so unless anyone has different opinion I'll convert the rest of the command to this. @dkropachev I see that both |
b605f97
to
a02254c
Compare
I love the idea. The code in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ cluster.rs
contains a lot of functions which are not used and whose goal is not always clear to me. Are we going to merge them as-is, only later wondering about their intended use and possibly adding documentation?
db70e01
to
8253f0d
Compare
New version of the PR. Finalized the move to a separate command builders and made a lot of other changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks much better.
fn append_node(&mut self, node_options: NodeOptions) -> &mut Node { | ||
let node_name = node_options.name(); | ||
let node = Node::new(node_options, self.ccm_cmd.for_node(node_name)); | ||
|
||
self.nodes.push(node); | ||
self.nodes.0.last_mut().unwrap() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that we return &mut Node
here. (previously that was Arc<RwLock<_>>
, correct?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure tbh.
8253f0d
to
3e4b32a
Compare
Addressed Mikolaj's comment. |
.PHONY: use_cargo_lock_msrv | ||
use_cargo_lock_msrv: | ||
mv Cargo.lock Cargo.lock.bak | ||
mv Cargo.lock.msrv Cargo.lock | ||
|
||
.PHONY: restore_cargo_lock | ||
restore_cargo_lock: | ||
mv Cargo.lock Cargo.lock.msrv | ||
mv Cargo.lock.bak Cargo.lock | ||
|
||
.PHONY: test_cargo_lock_msrv | ||
test_cargo_lock_msrv: use_cargo_lock_msrv check restore_cargo_lock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔧 Can you please add comments explaining use case of these commands?
pub(crate) fn wait_options(mut self, options: Option<NodeStartOptions>) -> Self { | ||
self.wait_opts = options; | ||
self | ||
} | ||
pub(crate) fn scylla_smp(mut self, smp: u16) -> Self { | ||
self.scylla_smp = Some(smp); | ||
self | ||
} | ||
|
||
pub(crate) fn scylla_mem_megabytes(mut self, mem_megabytes: u32) -> Self { | ||
self.scylla_mem_megabytes = Some(mem_megabytes); | ||
self | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ Why does wait_options()
accept Options
and the other two functions accept non-Option
s?
let mut ip_port = ip_hex.split(':'); | ||
if let Some(ip_hex) = ip_port.next() { | ||
if let Ok(ip) = u32::from_str_radix(ip_hex, 16) { | ||
let first_octet = ip as u8; | ||
if first_octet == 127 { | ||
used_ips | ||
.insert(LocalSubnetIdentifier((ip >> 8) as u8, (ip >> 16) as u8)); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should likely be ip_addr
, not ip_port
.
/// The constructor scans /proc/net/tcp for busy local subnets (127.x.x.x/24). The subnet is busy, | ||
/// if there is at least one listener on any port and any address in this network. | ||
pub(super) fn new() -> Result<Self, Error> { | ||
let mut used_ips: BTreeSet<LocalSubnetIdentifier> = BTreeSet::new(); | ||
let file = File::open("/proc/net/tcp").context("Failed to open /proc/net/tcp file")?; | ||
let lines = BufReader::new(file).lines(); | ||
for line_res in lines { | ||
let line = line_res.context("Failed to read a line from /proc/net/tcp")?; | ||
let parts: Vec<&str> = line.split_whitespace().collect(); | ||
if let Some(ip_hex) = parts.get(1) { | ||
let mut ip_port = ip_hex.split(':'); | ||
if let Some(ip_hex) = ip_port.next() { | ||
if let Ok(ip) = u32::from_str_radix(ip_hex, 16) { | ||
let first_octet = ip as u8; | ||
if first_octet == 127 { | ||
used_ips | ||
.insert(LocalSubnetIdentifier((ip >> 8) as u8, (ip >> 16) as u8)); | ||
} | ||
} | ||
} | ||
} | ||
} | ||
|
||
Ok(Self { used_ips }) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored the whole function:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// The constructor scans /proc/net/tcp for busy local subnets (127.x.x.x/24). The subnet is busy, | |
/// if there is at least one listener on any port and any address in this network. | |
pub(super) fn new() -> Result<Self, Error> { | |
let mut used_ips: BTreeSet<LocalSubnetIdentifier> = BTreeSet::new(); | |
let file = File::open("/proc/net/tcp").context("Failed to open /proc/net/tcp file")?; | |
let lines = BufReader::new(file).lines(); | |
for line_res in lines { | |
let line = line_res.context("Failed to read a line from /proc/net/tcp")?; | |
let parts: Vec<&str> = line.split_whitespace().collect(); | |
if let Some(ip_hex) = parts.get(1) { | |
let mut ip_port = ip_hex.split(':'); | |
if let Some(ip_hex) = ip_port.next() { | |
if let Ok(ip) = u32::from_str_radix(ip_hex, 16) { | |
let first_octet = ip as u8; | |
if first_octet == 127 { | |
used_ips | |
.insert(LocalSubnetIdentifier((ip >> 8) as u8, (ip >> 16) as u8)); | |
} | |
} | |
} | |
} | |
} | |
Ok(Self { used_ips }) | |
} | |
/// The constructor scans /proc/net/tcp for busy local subnets (127.x.y.z/24). The subnet is busy | |
/// if there is at least one listener on any port and any address in this network. | |
pub(super) fn new() -> Result<Self, Error> { | |
let mut used_ips: BTreeSet<LocalSubnetIdentifier> = BTreeSet::new(); | |
let file = File::open("/proc/net/tcp").context("Failed to open /proc/net/tcp file")?; | |
let mut lines = BufReader::new(file).lines(); | |
let _header_line = lines | |
.next() | |
.context("Failed to read the header line from /proc/net/tcp")?; | |
for line_res in lines { | |
let line = line_res.context("Failed to read a line from /proc/net/tcp")?; | |
line.split_whitespace() | |
.skip(1 /* ordinal number */) | |
.next() // Get second column - the local address. | |
.and_then(|ip_addr_hex| ip_addr_hex.split_once(':')) | |
.and_then(|(addr_hex, _port_hex)| u32::from_str_radix(addr_hex, 16).ok()) | |
.inspect(|&ip| { | |
let first_octet = ip as u8; | |
if first_octet == 127 { | |
used_ips.insert(LocalSubnetIdentifier((ip >> 8) as u8, (ip >> 16) as u8)); | |
} | |
}); | |
} | |
Ok(Self { used_ips }) | |
} |
_ => return Err(anyhow::anyhow!("Ipv6 addresses are not yet supported!")), | ||
}; | ||
let subnet_id: LocalSubnetIdentifier = ipv4.into(); | ||
|
||
if !self.used_ips.remove(&subnet_id) { | ||
return Err(anyhow::anyhow!( | ||
"IP prefix {} was not allocated - something gone wrong!", | ||
ip_prefix | ||
)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
♻️ consider using anyhow::bail!
as an idiomatic way.
tracing::info!( | ||
"{:15} -> failed to wait on child process: = {}", | ||
format!("exited[{}]", run_id), | ||
e | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❓ Just to make sure I undestand correctly: would the following be equivalent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tracing::info!( | |
"{:15} -> failed to wait on child process: = {}", | |
format!("exited[{}]", run_id), | |
e | |
); | |
tracing::info!( | |
"exited[{:7}] -> failed to wait on child process: = {}", | |
run_id | |
e | |
); |
3e4b32a
to
1564879
Compare
This allows us to use `expect` lint level.
Cleanup unused dependancies
ccm module will contain tests that require ccm. It's lib submodule will contain the ccm integration. Why do it this way - which is different than what we did during a hackathon? - Old way required ugly hacks to share test utils between integration test targets, and those hacks did not work well with rust-analyzer. - One target means better compilation time CCM tests will be guardded by a cfg, so we will still be able to run the subset that we want: - All tests: run integration tests with the required cfg - Only CCM tests: as above, but filter by ccm folder - Only non-ccm test: run without the cfg
Co-authored-by: Mikołaj Uzarski <mikolaj.uzarski@scylladb.com> Co-authored-by: Dmitry Kropachev <dmitry.kropachev@gmail.com>
For now it will run on each PR. If at some point it becomes too slow we can switch it to running manually and before release.
1564879
to
b183deb
Compare
Co-authored-by: Mikołaj Uzarski <mikolaj.uzarski@scylladb.com>
Auth tests are now run as a part of CCM test suite.
7c6d135
to
a939123
Compare
@muzarski @wprzytula This is a draft of the CCM backport into main branch.
I mostly took the current state from branch-hackathon. One important change I made is to remove separate
ccm-integration
test target. Instead ccm tests are now a module inintegration
.This avoids the problem with sharing utils. It should also be quicker to compile - no need to link 2 separate binaries.
Apart from that I did not really modify CCM integration. Now the question is: what do we do with it.
Are we satisfied with the API? Probably not.
If not, what should the API be like?
This is fully internal to the crate, so we can change it freely, so there is no need to spend too much time on it - we can always improve later.
Still, we should retain some reasonable level of quality, so I'd like to discuss this a bit.
I did not yet pick up @muzarski 's changes that integrated auth and TLS workflows into CCM. I can do that after we agree to the rest of this PR.
On the above matter @muzarski : How should I adapt CCM given that we now support multiple TLS backends?
I see we have in mod.rs DB_TLS_CERT_PATH, DB_TLS_KEY_PATH and CA_TLS_CERT_PATH (which I removed for now because it was guarded by old feature name).
I also see that on branch-hackathon you wrote a TLS test. We should probably make a test per backend, right?
What about those vars? Do they make sense for all the tests? In that case we just need to change feature guard on CA_TLS_CERT_PATH to be activated when any backend is active - or to even make it always active because why not.
Pre-review checklist
./docs/source/
.Fixes:
annotations to PR description.