Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add user interactions autocapture #186

Closed

Conversation

PouriaAmini
Copy link
Contributor

@PouriaAmini PouriaAmini commented Jul 1, 2024

Summary

This PR adds support for the auto-capture of user interactions with UI elements. The design is heavily based on the accessibility metadata available within the app for each UI element, which is mostly provided by the UI framework. Auto-capture of user interactions works for apps built with both UIKit and SwiftUI. To enable user interaction auto-capture, set the userInteractions parameter of the DefaultTrackingOptions initializer to true:

let amplitude = Amplitude(configuration: Configuration(
  apiKey: "API_KEY",
  defaultTracking: DefaultTrackingOptions(userInteractions = true)
))

New Automatically Captured Event

[Amplitude] User Interaction: Emitted when a user interacts with a UI element.

Note: To capture the event, the UI element must include some accessibility metadata (accessibilityLabel, accessibilityValue, and/or accessibilityTraits). Otherwise, the interaction event will be generic and will not contain information about the UI element with which the user interacted.

New Automatically Captured Properties

  • [Amplitude] Element Label: The accessibility label of the UI element the user interacted with. If empty, the property will be excluded.
  • [Amplitude] Element Value: The accessibility value of the UI element the user interacted with. If empty, the property will be excluded.
  • [Amplitude] Element Type: The accessibility traits of the UI element the user interacted with. If empty, the property will be excluded.
  • [Amplitude] Interaction: The type of the user interaction on the UI element. These interactions include:
    • Tap
    • Dead Tap: When a user clicks on an inactive button
    • Long Press
    • Dead Long Press
    • Rage Tap
    • Focus Gained (for text fields)
    • Focus Lost (for text fields)
    • Focus Lost After Text Modification (for text fields)
    • Value Changed To -% (for sliders)

PR Note: Tests targeting these implementations and Objective-C support will be included in a separate PR to reduce the size of this PR.

Checklist

  • Does your PR title have the correct title format?
  • Does your PR have a breaking change?: no

@PouriaAmini PouriaAmini self-assigned this Jul 1, 2024
@@ -7,131 +7,137 @@

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the changes to this file are to correct the indentation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check in any style related changes separately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Comment on lines +14 to +19
private var application: UIApplication? {
// TODO: Check if lifecycle plugin works for app extension
// App extensions can't use UIApplication.shared, so
// funnel it through something to check; Could be nil.
UIApplication.value(forKeyPath: "sharedApplication") as? UIApplication
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrapped the logic to get the UIApplication.shared in a computed property since UIApplication.shared is not available during the app startup.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? That should not be the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried accessing the application instance later when the app starts and with the previous code, the instance is always nil.

Copy link
Collaborator

@crleona crleona left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1/ Especially on your first few commits to a new language / project, please try to keep PRs small - this is too much code to effectively review in one go.
2/ I am concerned with the overall capture approach - while it is nice that it works for both Swift and SwiftUI, most apps do not put a lot of thought into a11y, and we are ignoring much of the context we can get from a closer integration with UIKit. Things like respecting gesture recognizer failure hierarchies are important to understanding which events actually fired.

Comment on lines +27 to +34
public convenience init (
sessions: Bool = true,
appLifecycles: Bool = false,
screenViews: Bool = false,
userInteractions: Bool = false
) {
self.init(sessions: sessions, appLifecycles: appLifecycles, screenViews: screenViews)
self.userInteractions = userInteractions
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you are adding a new convenience initializer vs just adding userInteractions to the default initializer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was to bypass an error related to the new parameter of the original initializer. I'll look into a different way to fix that.

Comment on lines +37 to +60
extension UIAccessibilityTraits {
func stringify() -> String? {
var strings = [String]()
if contains(.adjustable) { strings.append("Adjustable") }
if contains(.allowsDirectInteraction) { strings.append("Allows Direct Interaction") }
if contains(.button) { strings.append("Button") }
if contains(.causesPageTurn) { strings.append("Causes Page Turn") }
if contains(.header) { strings.append("Header") }
if contains(.image) { strings.append("Image") }
if contains(.keyboardKey) { strings.append("Keyboard Key") }
if contains(.link) { strings.append("Link") }
if contains(.notEnabled) { strings.append("Not Enabled") }
if contains(.playsSound) { strings.append("Plays Sound") }
if contains(.searchField) { strings.append("Search Field") }
if contains(.selected) { strings.append("Selected") }
if contains(.startsMediaSession) { strings.append("Starts Media Session") }
if contains(.staticText) { strings.append("Static Text") }
if contains(.summaryElement) { strings.append("Summary Element") }
if contains(.tabBar) { strings.append("Tab Bar") }
if contains(.updatesFrequently) { strings.append("Updates Frequently") }

return strings.isEmpty ? nil : strings.joined(separator: ", ")
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's not a lot here that's informative in any sort of analytical sense, we should not collect this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that some of them are unnecessary, but traits like button, link, tab, and search field can provide semantics info about the element that was clicked. Is that something that we would want to capture?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, in many cases, this data will be used to help uniquely identify an auto captured event - something like "signup button clicked".
So, if we have more specific indicators like an accessibility label, I'm not sure this is useful to capture. If information is a lot more sparse and this would be a useful indicator in telling one event from the other, then the above approach is probably the way to go.

@@ -7,131 +7,137 @@

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check in any style related changes separately.

@@ -0,0 +1,23 @@
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably a relic from including a local Amplitude-Swift in another project - please do not check this in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.


import UIKit

internal final class TextFieldDelegateWrapper: NSObject, UITextFieldDelegate {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overriding delegates is extremely error prone and is not a technique I am comfortable releasing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other approach I tried to capture when a text field gains/loses focus was through figuring out location of the user clicks, but it had many edge cases. Is there a better approach that overriding the delegate?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does UITextfield fire any relevant control events?


// MARK: - Life Cycle

init(for captureDelegateHandler: @escaping () -> UserInteractionCaptureDelegate?) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you passing around lazy accessors like this? There is almost assuredly a better way to accomplish whatever you're trying to work around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to get a weak reference to self. A capture list within a closure seemed to be the only way to do this. Is there a better way to accomplish this? (the weak reference here was to prevent CustomUIGestureRecognizers and UserInteractionCaptureDelegate have circular reference.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


// MARK: -

internal final class GlobalUIPressGestureRecognizer: UIGestureRecognizer {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you actually using anything from UIGestureRecognizer in this class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, I can remove UIGestureRecognizer in this case.

/// The accessibility targets within `keyWindow`.
var accessibilityTargets: [AccessibilityTarget] {
guard let keyWindow else { return [] }
return accessibilityHierarchyParser.parseAccessibilityElements(in: keyWindow)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to return the whole accessibility hierarchy when you only ever use it to find a single view?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! Looking at it again, I think we can just get the first element under the tap - my initial concern was that multiple elements might be tapped with a single tap, but that doesn't seem to be an issue anymore since I'm already getting the first one.

// MARK: - Private Properties

/// The activation delta between the initial touch point and the final touch point.
private static let pressActivationDelta: Double = 10
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is generally not what the system regards as a "press", please see the documentation for UIControl.Event.touchUpInside.

Copy link
Contributor Author

@PouriaAmini PouriaAmini Jul 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This delta seems to be the minimum difference between the initial and final touch points since anything more might cause a scroll or swipe if the touch was on a scrollable element. Although this might not detect some presses, it will avoid false positives where the change in touch positions was for a scroll, but a press is detected.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For something like UIButton, the default touchUpInside behavior would detect a touch down, swipe anywhere in bounds, then touch up over any period of time as a single event. I think auto tracking will replace manually firing an event from the default handler (ie, addTarget(_:action:for:) for UIButton), so we should get as close to this behavior as possible.

let startTime,
let initialTarget,
let initialRootView,
let endTouchLocation = touches.first?.location(in: initialRootView),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if there are multiple overlapping a11y elements? how would the system resolve this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first element from the hierarchy parser is the top most element.

@PouriaAmini
Copy link
Contributor Author

1/ Especially on your first few commits to a new language / project, please try to keep PRs small - this is too much code to effectively review in one go. 2/ I am concerned with the overall capture approach - while it is nice that it works for both Swift and SwiftUI, most apps do not put a lot of thought into a11y, and we are ignoring much of the context we can get from a closer integration with UIKit. Things like respecting gesture recognizer failure hierarchies are important to understanding which events actually fired.

Thanks @crleona for the review!
1/ I'll convert this into a draft and try to break this code into multiple PRs.
2/ The primary reason why I was lenient towards extracting analytics from accessibility metadata was that many of the built-in component in SwiftUI already have descriptive accessibility data - for example any button that has a label will have an accessibility label by default. The primary reason for opting for global gesture recognition than swizzling concrete subclasses of UIGestureRecognizer was that many of the SwiftUI components don't use UIGestureRecognizer.

@PouriaAmini PouriaAmini marked this pull request as draft July 2, 2024 00:16
Copy link
Collaborator

@Mercy811 Mercy811 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we merge this pr to a feature branch like we discussed?

@@ -83,6 +83,7 @@ public struct Constants {
static let AMP_APPLICATION_BACKGROUNDED_EVENT = "\(AMP_AMPLITUDE_PREFIX)Application Backgrounded"
static let AMP_DEEP_LINK_OPENED_EVENT = "\(AMP_AMPLITUDE_PREFIX)Deep Link Opened"
static let AMP_SCREEN_VIEWED_EVENT = "\(AMP_AMPLITUDE_PREFIX)Screen Viewed"
static let AMP_USER_INTERACTION_EVENT = "\(AMP_AMPLITUDE_PREFIX)User Interaction"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we change the name to be "User Interacted" to follow the pattern of other default tracking event names?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can discuss with Alan to align on the naming.

@PouriaAmini
Copy link
Contributor Author

Can we merge this pr to a feature branch like we discussed?

@Mercy811 - I'm thinking that since we are going to break this PR into chunks, it would minimize its impact on the SDK. So I'm not sure if having a feature branch would make the introduction of the new feature more safe. Unless we want to test the implementation with a set of customers first before merging in the feature. I'm open to further discussion.

public var shape: Shape

/// The object representing the accessibility node.
public var object: NSObject
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like these properties are not meant to change after initialization. If so, it makes more sense to use let instead of var.

Comment on lines +57 to +59
// MARK: - Life Cycle

public init() {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we delete it if it doesn't perform any additional setup

@crleona
Copy link
Collaborator

crleona commented Jul 2, 2024

2/ The primary reason why I was lenient towards extracting analytics from accessibility metadata was that many of the built-in component in SwiftUI already have descriptive accessibility data - for example any button that has a label will have an accessibility label by default. The primary reason for opting for global gesture recognition than swizzling concrete subclasses of UIGestureRecognizer was that many of the SwiftUI components don't use UIGestureRecognizer.

It does sound like this may be as good as we can get in SwiftUI, so we should keep this approach as a fallback - perhaps enabled by a different config if we ever run into trouble with loading a11y libs. I'd like to see where we can get with a UIKit only approach as well - there should be a lot more context we can gather, and we can get a closer mapping of tracked events to events that the app is actually taking action on.

@PouriaAmini PouriaAmini closed this Jul 2, 2024
@PouriaAmini PouriaAmini removed their assignment Jul 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants