Skip to content

Comments

Make Telemetry API log record type generic#1098

Open
raphael-theriault-swi wants to merge 2 commits intoaws:mainfrom
raphael-theriault-swi:generic-logs
Open

Make Telemetry API log record type generic#1098
raphael-theriault-swi wants to merge 2 commits intoaws:mainfrom
raphael-theriault-swi:generic-logs

Conversation

@raphael-theriault-swi
Copy link

📬 Issue #, if available: #977

✍️ Description of changes:

This adds a generic type parameter to LambdaTelemetryRecord for logs to support structured JSON logs. I initially tried to implement the RawValue newtype based solution suggested in #977, but it turns out you can't use RawValue within #[serde(flatten)].

This is a breaking change given it introduces a new generic parameter, but IMO it's better than introducing a whole new set of duplicate types, especially given supporting managed instances may already require breakage soon.

Unlike suggested in #977, this doesn't lock users into either supporting text or JSON logs at compile time, since they can just use an untagged enum:

#[derive(Deserialize)]
#[serde(untagged)]
enum LogRecord {
    Text(String),
    Json(StructuredLogRecord),
}

I also opted against adding a new struct for JSON logs since they don't have a fixed schema, some extension authors might want to use serde_json::Value, some might want a more specialized type.

🔏 By submitting this pull request

  • I confirm that I've ran cargo +nightly fmt.
  • I confirm that I've ran cargo clippy --fix.
  • I confirm that I've made a best effort attempt to update all relevant documentation.
  • I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jlizen
Copy link
Collaborator

jlizen commented Feb 13, 2026

This is a breaking change given it introduces a new generic parameter, but IMO it's better than introducing a whole new set of duplicate types, especially given supporting managed instances may already require breakage soon.

Managed instances will not require semver breaking. We are a long ways off from release 2.x.

With that in mind, do you have any thoughts about a semver-safe way to do it? I guess we just add a new API, deprecate the old, but allow continuing to use old until next release version?

@jlizen
Copy link
Collaborator

jlizen commented Feb 13, 2026

(Also, thanks for picking this up! This is definitely in an awkward spot right now)

@raphael-theriault-swi
Copy link
Author

I'm thinking maybe an opt-in feature to enable the generic might be more elegant ? Then the feature can just be removed when a future 2.x comes out.

@jlizen
Copy link
Collaborator

jlizen commented Feb 16, 2026

The opt-in feature will work, but it seems a bit confusing to have a feature that shifts runtime behavior like this, especially for such a limited feature scope. Let me get a temperature check from the other maintainers on this and follow up.

@jlizen
Copy link
Collaborator

jlizen commented Feb 19, 2026

Coming back to this -

Unfortunately a feature flag doesn't escape semver constraints, even if it is opt in (given feature unification).

Per cargo:

Enabling a feature should not introduce a SemVer-incompatible change. For example, the feature shouldn’t change an existing API in a way that could break existing uses.

ref

And then, any introduction of generics that don't have an default inference available, is breaking. Which, unfortunately, the function signatures run() and register run afoul of. The types are otherwise fine.

EDIT: Hold on a sec, are those just methods on extension? We might be able to work around it by hoisting it to be struct-wide, with a default.

@jlizen
Copy link
Collaborator

jlizen commented Feb 19, 2026

Yeah, it seems to be working ok with an extra generic on Extension:

pub struct Extension<'a, E, L, T, R = String>
 {
    extension_name: Option<&'a str>,
    ...
    _record_type: PhantomData<R>
}

(Maybe there is a way to do it without the PhantomData as well?)

And then we can hoist the telemetry where clause back to the struct definition:

impl<'a, E, L, T, R> Extension<'a, E, L, T, R>
where
    E: Service<LambdaEvent>,
    E::Future: Future<Output = Result<(), E::Error>>,
    E::Error: Into<Error> + fmt::Display + fmt::Debug,

    // Fixme: 'static bound might be too restrictive
    L: MakeService<(), Vec<LambdaLog>, Response = ()> + Send + Sync + 'static,
    L::Service: Service<Vec<LambdaLog>, Response = ()> + Send + Sync,
    <L::Service as Service<Vec<LambdaLog>>>::Future: Send + 'a,
    L::Error: Into<Error> + fmt::Debug,
    L::MakeError: Into<Error> + fmt::Debug,
    L::Future: Send,
    T: MakeService<(), Vec<LambdaTelemetry<R>>, Response = ()> + Send + Sync + 'static,
    T::Service: Service<Vec<LambdaTelemetry<R>>, Response = ()> + Send + Sync,
    <T::Service as Service<Vec<LambdaTelemetry<R>>>>::Future: Send + 'a,
    T::Error: Into<Error> + fmt::Debug,
    T::MakeError: Into<Error> + fmt::Debug,
    T::Future: Send,
    R: DeserializeOwned + Send + 'static,
{ }

@jlizen
Copy link
Collaborator

jlizen commented Feb 19, 2026

And then, it's unpleasant for the user to have to specify all the generic parameters as Extension<_, _, _, serde_json::Value>.

But, seem to have consistent inference if we update with_telemetry processor to accept an incoming type:

impl<'a, E, L, T, R> Extension<'a, E, L, T, R> {
   ...
    pub fn with_telemetry_processor<N, NS, NewR>(self, lp: N) -> Extension<'a, E, L, N, NewR> { }

}

And then you can specify it in your handler that this function receives:

async fn handler(events: Vec<LambdaTelemetry<serde_json::Value>>) -> Result<(), Error> {
   println!("{events:#}");
    Ok(())
}

#[tokio::main]
fn main() {
    let telemetry_processor = SharedService::new(service_fn(handler));

    Extension::new()
        .with_telemetry_processor(telemetry_processor)
        .run()
        .await.unwrap()
}

(Or not, and then it defaults to String, and stays backwards compatible)

@jlizen
Copy link
Collaborator

jlizen commented Feb 19, 2026

What do you think about the proposed API @raphael-theriault-swi ?

I think we should flip the generic default next major version, but that might be a ways off. (At which point, your untagged LogRecord seems nice)

Are these ergonomics adequate in the meantime? No feature flags at least?

Probably a small examples/ entry would be good too to show how to use this (or update extension-telemetry-basic).

@raphael-theriault-swi
Copy link
Author

The thing is that adding the default generic type to the enum itself is technically a breaking change the moment you try to use an enum literal of a variant that doesn't include the generic.

enum Generic<T = &'static str> {
    Generic(T),
    Fixed(&'static str),
}

fn main() {
    let generic = Generic::Generic("inference succeeds");
    let fixed = Generic::Fixed("inference fails");
}

@jlizen
Copy link
Collaborator

jlizen commented Feb 19, 2026

The thing is that adding the default generic type to the enum itself is technically a breaking change the moment you try to use an enum literal of a variant that doesn't include the generic.

That may be, but in the current approach, LambdaTelemetryRecord is using the generic on each variant, and is already pub and not #[non_exhaustive], so we couldn't add a new variant without the generic anyway, right? Or else that would be a breaking change regardless?

You can see that the semver check is only complaining about the new generics in the methods, which my suggestion avoids.

Please correct me if I am missing something!

@raphael-theriault-swi
Copy link
Author

Yeah, I do't think there's anyway to make this a totally non-breaking change without duplicating pretty large parts of the API. Personally I feel the solution you suggested in your recent replies is the best way to go, I just wanted to bring up the potential breakage.

Should I go ahead and implement it that way ?

@rcoh
Copy link

rcoh commented Feb 20, 2026

Typically, inference failure is not considered SemVer breaking: https://doc.rust-lang.org/cargo/reference/semver.html#fn-generalize-compatible

One option (I have not totally read all context in here) is to introduce a new type LambdaTelemetryRecordV2<T> (bikeshed), replacing LambdaTelemetryRecord with pub type LambdaTelemetryRecord = LambdaTelemetryRecordV2<String>

you could then add register_generic_extension<TL> that used LambdaRecordV2<TL> to allow for using the generic.

Since register_extension would delegate directly to register_generic_extension, the API itself should remain largely unchanged and the only change in surface area would be the new methods.

@jlizen
Copy link
Collaborator

jlizen commented Feb 20, 2026

I think Russell's solution is a clean one that would avoid duplication:

  • use a type alias pub type LambdaTelemetryRecord = GenericLambdaTelemetryRecord<String>
  • under the hood, remove LambdaTelemetryRecord entirely and replace it with GenericLambdaTelemetryRecord, which looks like the generic version of the original that we have been working with
  • add a new API to the builder, with_generic_telemetry_processor(), which accepts GenericLambdaTelemetryRecord for an arbitrary type
  • probably also add, JsonLambdaTelemetryRecord = GenericLambdaTelemetryRecord<serde_json::Value>, and with_json_telemetry_processor for ergonomics?

And then no inference breaks on existing LambdaTelemetryRecord construction, since LambdaTelemetryRecord is a type alias with no generics.

And usage looks fine:

# record is a string
async fn handler(events: Vec<LambdaTelemetryRecord) -> Result<(), Error> {
   println!("{events:#}");
    Ok(())
}

#[tokio::main]
fn main() {
    let telemetry_processor = SharedService::new(service_fn(handler));

    Extension::new()
        .with_telemetry_processor(telemetry_processor)
        .run()
        .await.unwrap()
}
# record is a serde_json::Value
async fn handler(events: Vec<JsonLambdaTelemetryRecord) -> Result<(), Error> {
   println!("{events:#}");
    Ok(())
}

#[tokio::main]
fn main() {
    let telemetry_processor = SharedService::new(service_fn(handler));

    Extension::new()
        .with_json_telemetry_processor(telemetry_processor)
        .run()
        .await.unwrap()
}

(and you can imagine for the generic)

What do you think about that, @raphael-theriault-swi ?

If there are blockers to this approach, I think we are probably ok to move forward with the default generic that might break inference on direct LambdaTelemetryRecord usage, with just a minor version bump (which should go in this PR for lambda-extension Cargo.toml / lambda-runtime's dependency, since I don't think cargo-semver-checks is smart enough to flag it).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants